IMPROVED EXPRESSION OF HIV POLYPEPTIDES AND
PRODUCTION OF VIRUS-LIKE PARTICLES



Technical Field 

Synthetic expression cassettes encoding the HIV
polypeptides (e.g., Gag-, pol-, prot-, reverse
transcriptase, Env- or tat-containing polypeptides) are
described, as are uses of the expression cassettes. The
present invention relates to the efficient expression of
HIV polypeptides in a variety of cell types. Further,
the invention provides methods of producing Virus-Like
Particles (VLPs), as well as uses of the VLPs and high
level expression of oligomeric envelope proteins.



Background of the Invention 

Acquired immune deficiency syndrome (AIDS) is
recognized as one of the greatest health threats facing
modern medicine. There is, as yet, no cure for this
disease.

In 1983-1984, three groups independently identified
the suspected etiological agent of AIDS. See, e.g.,
Barre-Sinoussi et al. (1983) Science 220:868-871;
Montagnier et al., in Human T-Cell Leukemia Viruses
(Gallo, Essex & Gross, eds., 1984); Vilmer et al. (1984)
The Lancet 1:753; Popovic et al. (1984) Science
224:497-500; Levy et al. (1984) Science 225:840-842.
These isolates were variously called
lymphadenopathy-associated virus (LAV), human T-cell
lymphotropic virus type III (HTLV-III), or
AIDS-associated retrovirus (ARV).
All of these isolates are strains of the same virus, and
were later collectively named Human Immunodeficiency
Virus (HIV). With the isolation of a related
AIDS-causing virus, the strains originally called HIV are
now termed HIV-1 and the related virus is called HIV-2.
See, e.g., Guyader et al. (1987) Nature 326:662-669;
Brun-Vezinet et al. (1986) Science 233:343-346; Clavel et
al. (1986) Nature 324:691-695.

A great deal of information has been gathered about
HIV; however, to date an effective vaccine has
not been identified. Several targets for vaccine
development have been examined including the env, Gag,
pol and tat gene products encoded by HIV.

Haas, et al., (Current Biology 6(3):315-324, 1996)
suggested that selective codon usage by HIV-1 appeared to
account for a substantial fraction of the inefficiency of
viral protein synthesis. Andre, et al., (J. Virol.
72(2):1497-1503, 1998) described an increased immune
response elicited by DNA vaccination employing a
synthetic gp120 sequence with optimized codon usage.
Schneider, et al., (J. Virol. 71(7):4892-4903, 1997)
discuss inactivation of inhibitory (or instability)
elements (INS) located within the coding sequences of the
Gag and Gag-protease coding sequences.

The Gag proteins of HIV-1 are necessary for the
assembly of virus-like particles. HIV-1 Gag proteins are
involved in many stages of the life cycle of the virus
including assembly, virion maturation after particle
release, and early post-entry steps in virus replication.
The roles of HIV-1 Gag proteins are numerous and complex
(Freed, E.O., Virology 251:1-15, 1998).

Wolf, et al., (PCT International Application, WO
96/30523, published 3 October 1996; European Patent
Application, Publication No. 0 449 116 A1, published 2
October 1991) have described the use of altered Pr55 Gag
of HIV-1 to act as a non-infectious retroviral-like
particulate carrier, in particular, for the presentation
of immunologically important epitopes. Wang, et al.,
(Virology 200:524-534, 1994) describe a system to study
assembly of HIV Gag-β-galactosidase fusion proteins into
virions. They describe the construction of sequences
encoding HIV Gag-β-galactosidase fusion proteins, the
expression of such sequences in the presence of HIV Gag
proteins, and assembly of these proteins into virus
particles.

Recently, Shiver, et al., (PCT International
Application, WO 98/34640, published 13 August 1998)
described altering HIV-1 (CAM1) Gag coding sequences to
produce synthetic DNA molecules encoding HIV Gag and
modifications of HIV Gag. The codons of the synthetic
molecules were codons preferred by a projected host cell.
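
By way of illustration only, the general idea of replacing codons with synonymous codons preferred by a projected host cell can be sketched as follows. This is a minimal sketch and not the method of any of the cited references: the `PREFERRED` table is a hypothetical, illustrative choice of one GC-rich codon per amino acid, the function names are assumptions, and the input is a toy sequence rather than an HIV sequence. Only the standard genetic code is taken as given.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical "preferred codon" table (one codon per amino acid).
# Illustrative assumption only; a real table would be derived from the
# codon usage of highly expressed genes in the chosen host cell.
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def translate(cds: str) -> str:
    """Translate a coding sequence (length divisible by 3) to protein."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str) -> str:
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(cds))


toy_cds = "ATGAAAGCATTATTTTAA"            # toy AT-rich sequence: M K A L F stop
synthetic = recode(toy_cds)
# The encoded polypeptide must be unchanged by the recoding.
assert translate(synthetic) == translate(toy_cds)
```

The recoding changes only the nucleotide sequence; the final assertion checks that the encoded amino acid sequence is preserved.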

The envelope protein of HIV-1 is a glycoprotein of
about 160 kD (gp160). During virus infection of the host
cell, gp160 is cleaved by host cell proteases to form
gp120 and the integral membrane protein, gp41. The gp41
portion is anchored in (and spans) the membrane bilayer
of the virion, while the gp120 segment protrudes into the
surrounding environment. As there is no covalent
attachment between gp120 and gp41, free gp120 is released
from the surface of virions and infected cells.

Haas, et al., (Current Biology 6(3):315-324, 1996)
suggested that selective codon usage by HIV-1 appeared to
account for a substantial fraction of the inefficiency of
viral protein synthesis. Andre, et al., (J. Virol.
72(2):1497-1503, 1998) described an increased immune
response elicited by DNA vaccination employing a
synthetic gp120 sequence with optimized codon usage.

Summary of the Invention

The present invention relates to improved expression
of HIV Env-, tat-, pol-, prot-, reverse transcriptase, or
Gag-containing polypeptides and production of virus-like
particles.

In one embodiment the present invention includes an
expression cassette, comprising a polynucleotide encoding
an HIV Gag polypeptide comprising a sequence having at
least 90% sequence identity to the sequence presented as
SEQ ID NO:20. In certain embodiments, the polynucleotide
sequence encoding said Gag polypeptide comprises a
sequence having at least 90% sequence identity to the
sequence presented as SEQ ID NO:9 or SEQ ID NO:4. The
expression cassettes may further include a polynucleotide
sequence encoding an HIV protease polypeptide, for
example a nucleotide sequence having at least 90%
sequence identity to a sequence selected from the group
consisting of: SEQ ID NO:5, SEQ ID NO:78, and SEQ ID
NO:79. The expression cassettes may further include a
polynucleotide sequence encoding an HIV reverse
transcriptase polypeptide, for example a sequence having
at least 90% sequence identity to a sequence selected
from the group consisting of: SEQ ID NO:80, SEQ ID
NO:81, SEQ ID NO:82, SEQ ID NO:83, and SEQ ID NO:84. The
expression cassettes may further include a polynucleotide
sequence encoding an HIV tat polypeptide, for example a
sequence selected from the group consisting of: SEQ ID
[...]. The expression
cassettes may further include a polynucleotide sequence
encoding an HIV polymerase polypeptide, for example a
sequence having at least 90% sequence identity to the
sequence presented as SEQ ID NO:6. The expression
cassettes may include a polynucleotide sequence encoding
an HIV polymerase polypeptide, wherein (i) the nucleotide
sequence encoding said polypeptide comprises a sequence
having at least 90% sequence identity to the sequence
presented as SEQ ID NO:4, and (ii) wherein the sequence
is modified by deletions of coding regions corresponding
to reverse transcriptase and integrase. The expression
cassettes described above may preserve T-helper cell and
CTL epitopes. The expression cassettes may further
include a polynucleotide sequence encoding an HCV core
polypeptide, for example a sequence having at least 90%
sequence identity to the sequence presented as SEQ ID
NO:7.
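
For illustration, a threshold such as "at least 90% sequence identity" can be checked with a calculation of the following kind. This is a minimal sketch under stated assumptions, not the definition of sequence identity used in this specification: it assumes the two sequences have already been aligned to equal length (with `-` marking gaps), and it counts identity as matching columns divided by all aligned columns. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = matching columns / aligned columns, where a
    column containing a gap in one sequence counts as a mismatch and a
    column with gaps in both sequences is ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def has_min_identity(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the aligned pair meets the identity threshold (in percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example: 9 of 10 aligned positions match.
reference = "ACGTACGTAC"
candidate = "ACGTACGTAA"
print(percent_identity(reference, candidate), has_min_identity(reference, candidate))
```

In practice the alignment itself would be produced by a dedicated alignment program; the choice of algorithm and gap treatment affects the reported value, so they should be stated alongside any identity figure.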

In another aspect, the invention includes an
expression cassette, comprising a polynucleotide sequence
encoding a polypeptide including an HIV Env polypeptide,
wherein the polynucleotide sequence encoding said Env
polypeptide comprises a sequence having at least 90%
sequence identity to SEQ ID NO:71 (Figure 58) or SEQ ID
NO:72 (Figure 59). In certain embodiments, the Env
expression cassettes include sequences flanking a V1
region but have a deletion in the V1 region itself, for
example the sequence presented as SEQ ID NO:65 (Figure
52, gp160.modUS4.delV1). In certain embodiments, the Env
expression cassettes include sequences flanking a V2
region but have a deletion in the V2 region itself, for
example the sequences shown in SEQ ID NO:60 (Figure 47);
SEQ ID NO:66 (Figure 53); SEQ ID NO:34 (Figure 20); SEQ
ID NO:37 (Figure 24); SEQ ID NO:40 (Figure 27); SEQ ID
NO:43 (Figure 30); SEQ ID NO:46 (Figure 33); SEQ ID NO:76
(Figure 64) and SEQ ID NO:49 (Figure 36). In certain
embodiments, the Env expression cassettes include
sequences flanking a V1/V2 region but have a deletion in
the V1/V2 region itself, for example, SEQ ID NO:59
(Figure 46); SEQ ID NO:61 (Figure 48); SEQ ID NO:67
(Figure 54); SEQ ID NO:75 (Figure 63); SEQ ID NO:35
(Figure 21); SEQ ID NO:38 (Figure 25); SEQ ID NO:41
(Figure 28); SEQ ID NO:44 (Figure 31); SEQ ID NO:47
(Figure 34) and SEQ ID NO:50 (Figure 37). The
Env-encoding expression cassettes may also include a mutated
cleavage site that prevents the cleavage of a gp140
polypeptide into a gp120 polypeptide and a gp41
polypeptide, for example, SEQ ID NO:57 (Figure 44); SEQ
ID NO:61 (Figure 48); SEQ ID NO:63 (Figure 50); SEQ ID
NO:39 (Figure 26); SEQ ID NO:40 (Figure 27); SEQ ID NO:41
(Figure 28); SEQ ID NO:42 (Figure 29); SEQ ID NO:43
(Figure 30); SEQ ID NO:44 (Figure 31); SEQ ID NO:45
(Figure 32); SEQ ID NO:46 (Figure 33); and SEQ ID NO:47
(Figure 34). The Env expression cassettes may include a
gp160 Env polypeptide or a polypeptide derived from a
gp160 Env polypeptide, for example SEQ ID NO:64 (Figure
51); SEQ ID NO:65 (Figure 52); SEQ ID NO:66 (Figure 53);
SEQ ID NO:67 (Figure 54); SEQ ID NO:68 (Figure 55); SEQ
ID NO:75 (Figure 63); SEQ ID NO:73 (Figure 61); SEQ ID
NO:48 (Figure 35); SEQ ID NO:49 (Figure 36); SEQ ID NO:50
(Figure 37); SEQ ID NO:76 (Figure 64); and SEQ ID NO:74
(Figure 62). The Env expression cassettes may include a
gp140 Env polypeptide or a polypeptide derived from a
gp140 Env polypeptide, for example SEQ ID NO:56 (Figure
43); SEQ ID NO:57 (Figure 44); SEQ ID NO:58 (Figure 45);
SEQ ID NO:59 (Figure 46); SEQ ID NO:60 (Figure 47); SEQ
ID NO:61 (Figure 48); SEQ ID NO:62 (Figure 49); SEQ ID
NO:63 (Figure 50); SEQ ID NO:36 (Figure 23); SEQ ID NO:37
(Figure 24); SEQ ID NO:38 (Figure 25); SEQ ID NO:39
(Figure 26); SEQ ID NO:40 (Figure 27); SEQ ID NO:41
(Figure 28); SEQ ID NO:42 (Figure 29); SEQ ID NO:43
(Figure 30); SEQ ID NO:44 (Figure 31); SEQ ID NO:45
(Figure 32); SEQ ID NO:46 (Figure 33); and SEQ ID NO:47
(Figure 34). The Env expression cassettes may also
include a gp120 Env polypeptide or a polypeptide derived
from a gp120 Env polypeptide, for example SEQ ID NO:54
(Figure 41); SEQ ID NO:55 (Figure 42); SEQ ID NO:33
(Figure 19); SEQ ID NO:34 (Figure 20); and SEQ ID NO:35
(Figure 21). The Env expression cassettes may include an
Env polypeptide lacking the amino acids corresponding to
residues 128 to about 194, relative to strains SF162 or
US4, for example, SEQ ID NO:55 (Figure 42); SEQ ID NO:62
(Figure 49); SEQ ID NO:63 (Figure 50); and SEQ ID NO:68
(Figure 55).

In another aspect, the invention includes a
recombinant expression system for use in a selected host
cell, comprising one or more of the expression cassettes
described herein operably linked to control elements
compatible with expression in the selected host cell. The
expression cassettes may be included on one or on
multiple vectors and may use the same or different
promoters. Exemplary control elements include a
transcription promoter (e.g., CMV, CMV+intron A, SV40,
RSV, HIV-LTR, MMLV-LTR, and metallothionein), a
transcription enhancer element, a transcription
termination signal, polyadenylation sequences, sequences
for optimization of initiation of translation, and
translation termination sequences.

In another aspect, the invention includes a
recombinant expression system for use in a selected host
cell, comprising any one of the expression cassettes
described herein operably linked to control elements
compatible with expression in the selected host cell.
Exemplary control elements include, but are not limited
to, a transcription promoter (e.g., CMV, CMV+intron A,
SV40, RSV, HIV-LTR, MMLV-LTR, and metallothionein), a
transcription enhancer element, a transcription
termination signal, polyadenylation sequences, sequences
for optimization of initiation of translation, and
translation termination sequences.

WO 00/39302 PCT/US99/31245

In yet another aspect, the invention includes a cell
comprising one or more of the expression cassettes
described herein operably linked to control elements
compatible with expression in the cell. The cell can be,
for example, a mammalian cell (e.g., BHK, VERO, HT1080,
293, RD, COS-7, or CHO cells), an insect cell (e.g.,
Trichoplusia ni (Tn5) or Sf9), a bacterial cell, a plant
cell, a yeast cell, or an antigen presenting cell (e.g.,
primary, immortalized or tumor-derived lymphoid cells
such as macrophages, monocytes, dendritic cells, B-cells,
T-cells, stem cells, and progenitor cells thereof).

In another aspect, the invention includes methods
for producing a polypeptide including HIV Gag-, prot-,
pol-, reverse transcriptase, Env- or Tat-containing
polypeptide sequences, said method comprising incubating
the cells comprising one or more of the expression cassettes
described herein, under conditions for producing said
polypeptide.

In yet another aspect, the invention includes 
compositions for generating an immunological response, 
comprising one or more of the expression cassettes 
described herein. In certain embodiments, the
compositions also include an adjuvant. 

In a still further aspect, the invention includes
methods of generating an immune response in a subject,
comprising introducing a composition comprising one or
more of the expression cassettes described herein into
the subject under conditions that are compatible with
expression of said expression cassette in the subject.
In certain embodiments, the expression cassette is
introduced using a gene delivery vector. More than one
expression cassette may be introduced using one or more
gene delivery vectors.

In yet another aspect, the invention includes a 
purified polynucleotide comprising a polynucleotide 
sequence encoding a polypeptide including an HIV Env 
polypeptide, wherein the polynucleotide sequence encoding 
said Env polypeptide comprises a sequence having at least 
90% sequence identity to SEQ ID NO: 71 (Figure 58) or SEQ 
ID NO: 72 (Figure 59). Further exemplary purified 
polynucleotide sequences were presented above. 

The polynucleotides of the present invention can be 
produced by recombinant techniques, synthetic techniques,
or combinations thereof. 

In another embodiment, the invention includes a 
method for producing a polypeptide including HIV Gag 
polypeptide sequences, where the method comprises 
incubating any of the above cells containing an 
expression cassette of interest under conditions for 
producing the polypeptide. 

The invention further includes a method for
producing virus-like particles (VLPs) where the method
comprises incubating any of the above-described cells
containing an expression cassette of interest under
conditions for producing VLPs.

In another aspect the invention includes a method
for producing a composition of virus-like particles
(VLPs) where any of the above-described cells containing
an expression cassette of interest are incubated under
conditions for producing VLPs, and the VLPs are
substantially purified to produce a composition of VLPs.

In a further embodiment of the present invention,
packaging cell lines are produced using the expression
cassettes of the present invention. For example, a cell
line useful for packaging lentivirus vectors comprises
suitable host cells that have an expression vector
containing an expression cassette of the present
invention wherein said polynucleotide sequence is
operably linked to control elements compatible with
expression in the host cell. In a preferred embodiment,
such host cells may be transfected with one or more
expression cassettes having a polynucleotide sequence
that encodes an HIV polymerase polypeptide or
polypeptides derived therefrom, for example, where the
nucleotide sequence encoding said polypeptide comprises a
sequence having at least 90% sequence identity to the
sequence presented as SEQ ID NO:6. Further, the HIV
polymerase polypeptide may be modified by deletions of
coding regions corresponding to reverse transcriptase and
integrase. Such a polynucleotide sequence may preserve
T-helper cell and CTL epitopes, for example when used in
a vaccine application. In addition, the polynucleotide
sequence may also include other polypeptides. Further,
polynucleotide sequences encoding additional polypeptides
whose expression is useful for packaging cell line
function may also be utilized.

In another aspect, the present invention includes a
gene delivery or vaccine vector for use in a subject,
where the vector is a suitable gene delivery vector for
use in the subject, and the vector comprises one or more
of any of the expression cassettes of the present
invention where the polynucleotide sequences of interest
are operably linked to control elements compatible with
expression in the subject. Such gene delivery vectors
can be used in a method of DNA immunization of a subject,
for example, by introducing a gene delivery vector into
the subject under conditions that are compatible with
expression of the expression cassette in the subject.
Gene delivery vectors useful in the practice of the
present invention include, but are not limited to,
nonviral vectors, bacterial plasmid vectors, viral
vectors, particulate carriers (where the vector is coated
on polylactide co-glycolide particles or on gold or
tungsten particles; for example, the coated particle can be
delivered to a subject cell using a gene gun), liposome
preparations, and viral vectors (e.g., vectors derived
from alphaviruses, pox viruses, and vaccinia viruses, as
well as retroviral vectors, including, but not limited
to, lentiviral vectors). Alphavirus-derived vectors
include, for example, an alphavirus cDNA construct, a
recombinant alphavirus particle preparation and a
eukaryotic layered vector initiation system. In one
embodiment, the subject is a vertebrate, preferably a
mammal, and in a further embodiment the subject is a
human.

The invention further includes a method of
generating an immune response in a subject, where cells
of a subject are transfected with any of the
above-described gene delivery vectors (e.g., alphavirus
constructs; alphavirus cDNA constructs; eukaryotic
layered vector initiation systems (see, e.g., U.S. Patent
Number 5,814,482 for description of suitable eukaryotic
layered vector initiation systems); alphavirus particle
preparations; etc.) under conditions that permit the
expression of a selected polynucleotide and production of
a polypeptide of interest (i.e., encoded by any
expression cassette of the present invention), thereby
eliciting an immunological response to the polypeptide.
Transfection of the cells may be performed ex vivo and
the transfected cells are reintroduced into the subject.
Alternatively, or in addition, the cells may be transfected
in vivo in the subject. The immune response may be
humoral and/or cell-mediated (cellular).

Further embodiments of the present invention include
purified polynucleotides. In one embodiment, the
purified polynucleotide comprises a polynucleotide
sequence having at least 90% sequence identity to the
sequence presented as SEQ ID NO:20, and complements
thereof. In another embodiment, the purified
polynucleotide comprises a polynucleotide sequence
encoding an HIV Gag polypeptide, wherein the
polynucleotide sequence comprises a sequence having at
least 90% sequence identity to the sequence presented as
SEQ ID NO:20, and complements thereof. In still another
embodiment, the purified polynucleotide comprises a
polynucleotide sequence encoding an HIV Gag polypeptide,
wherein the polynucleotide sequence comprises a sequence
having at least 90% sequence identity to the sequence
presented as SEQ ID NO:9, and complements thereof. In
further embodiments the polynucleotide sequence comprises
a sequence having at least 90% sequence identity to one
of the following sequences: SEQ ID NO:4, SEQ ID NO:5, SEQ
ID NO:6, SEQ ID NO:7, and complements thereof.

The polynucleotides of the present invention can be 
produced by recombinant techniques, synthetic techniques, 
or combinations thereof.

These and other embodiments of the present invention
will readily occur to those of ordinary skill in the art
in view of the disclosure herein.

Brief Description of the Figures

Figure 1 shows the locations of the inactivation
sites for the native HIV-1SF2 Gag protein coding
sequence.

Figure 2 shows the locations of the inactivation
sites for the native HIV-1SF2 Gag-protease protein coding
sequence.

Figures 3A and 3B show electron micrographs of
virus-like particles. Figure 3A shows immature p55Gag
virus-like particles in COS-7 cells transfected with a
synthetic HIV-1SF2 gag construct while Figure 3B shows
mature (arrows) and immature VLP in cells transfected
with a modified HIV-1SF2 gagprotease construct (GP2, SEQ
ID NO:79). Transfected cells were fixed at 24 h (gag) or
48 h (gagprotease) post-transfection and subsequently
analyzed by electron microscopy (magnification at
100,000X). Cells transfected with vector alone (pCMVKm2)
served as negative control (data not shown).

Figure 4 presents an image of samples from a series
of fractions which were electrophoresed on an 8-16% SDS
polyacrylamide gel and the resulting bands visualized by
Coomassie blue staining. The results show that the
native p55 Gag virus-like particles (VLPs) banded at a
sucrose density range of 1.15 - 1.19 g/ml with the
peak at approximately 1.17 g/ml.

Figure 5 presents an image similar to Figure 4 where 
the analysis was performed using Gag VLPs produced by a 
synthetic Gag expression cassette.

Figure 6 presents a comparison of the total amount
of purified HIV p55 Gag from several preparations
obtained from two baculovirus expression cassettes
encoding native and modified Gag.

Figure 7 presents an alignment of modified coding
sequences of the present invention including a synthetic
Gag expression cassette (SEQ ID NO:4), a synthetic
Gag-protease expression cassette (SEQ ID NO:5), and a
synthetic Gag-polymerase expression cassette (SEQ ID
NO:6). A common region (Gag-common; SEQ ID NO:9) extends
from position 1 to position 1262.

Figure 8 presents an image of wild-type Gag-HCV core
expression samples from a series of fractions which were
electrophoresed on an 8-16% SDS polyacrylamide gel and
the resulting bands visualized by Coomassie staining.

Figure 9 shows the results of Western blot analysis
of the gel presented in Figure 8.

Figure 10 presents results similar to those shown in
Figure 9. The results in Figure 10 indicate that the
main HCV Core-specific reactivity migrates at an
approximate molecular weight of 72 kD, which is in
accordance with the predicted molecular weight of the
Gag-HCV core chimeric protein.

Figures 11A to 11D present a comparison of AT
content, in percent, of cDNAs corresponding to an
unstable human mRNA (human IFNγ mRNA; 11A), wild-type HIV
Gag native RNA (11B), a stable human mRNA (human GAPDH
mRNA; 11C), and synthetic HIV Gag RNA (11D).
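
By way of illustration only, an AT-content comparison of the kind shown in Figures 11A to 11D can be computed with a sliding window along each sequence. The following is a minimal sketch: the window and step sizes, the function names, and the toy sequences are assumptions for illustration and are not taken from the figures.

```python
def at_percent(seq: str) -> float:
    """Percent of A and T (or U) residues in a nucleotide sequence."""
    seq = seq.upper().replace("U", "T")  # treat RNA and cDNA alike
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def at_profile(seq: str, window: int = 100, step: int = 10) -> list[float]:
    """AT content (percent) in successive windows along the sequence.

    Window and step sizes are illustrative assumptions. Sequences shorter
    than one window yield an empty profile.
    """
    return [
        at_percent(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequences (not HIV sequences): an AT-rich stretch and a GC-rich one.
at_rich = "ATGAAAGCATTATTTAAATTAGAA"
gc_rich = "ATGAAGGCCCTGTTCAAGCTGGAG"
print(at_percent(at_rich), at_percent(gc_rich))
print(at_profile(at_rich, window=12, step=6))
```

Plotting the profile of a native sequence next to that of its synthetic counterpart gives a picture comparable in kind to the panels described above.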

Figure 12 shows the location of the inactivation
sites for the native HIV-1SF2 Gag-polymerase sequence.

Figure 13A presents a vector map of pESN2dhfr.

Figure 13B presents a map of the pCMVIII vector.

Figure 14 presents a vector map of pCMV-LINK.

Figure 15 presents a schematic diagram showing the
relationships between the following forms of the HIV Env
polypeptide: gp160, gp140, gp120, and gp41.

Figure 16 depicts the nucleotide sequence of
wild-type gp120 from SF162 (SEQ ID NO:30).

Figure 17 depicts the nucleotide sequence of the
wild-type gp140 from SF162 (SEQ ID NO:31).

Figure 18 depicts the nucleotide sequence of the
wild-type gp160 from SF162 (SEQ ID NO:32).

Figure 19 depicts the nucleotide sequence of the
construct designated gp120.modSF162 (SEQ ID NO:33).

Figure 20 depicts the nucleotide sequence of the
construct designated gp120.modSF162.delV2 (SEQ ID NO:34).

Figure 21 depicts the nucleotide sequence of the
construct designated gp120.modSF162.delV1/V2 (SEQ ID
NO:35).

Figures 22A-H show the percent A-T content over the
length of the sequences for IFNγ (Figures 22C and 22G);
native gp160 Env US4 and SF162 (Figures 22A and 22E,
respectively); GAPDH (Figures 22D and 22H); and the
synthetic gp160 Env for US4 and SF162 (Figures 22B and 22F,
respectively).

Figure 23 depicts the nucleotide sequence of the
construct designated gp140.modSF162 (SEQ ID NO:36).

Figure 24 depicts the nucleotide sequence of the
construct designated gp140.modSF162.delV2 (SEQ ID NO:37).

Figure 25 depicts the nucleotide sequence of the
construct designated gp140.modSF162.delV1/V2 (SEQ ID
NO:38).

Figure 26 depicts the nucleotide sequence of the
construct designated gp140.mut.modSF162 (SEQ ID NO:39).

Figure 27 depicts the nucleotide sequence of the
construct designated gp140.mut.modSF162.delV2 (SEQ ID
NO:40).

Figure 28 depicts the nucleotide sequence of the
construct designated gp140.mut.modSF162.delV1/V2 (SEQ ID
NO:41).

Figure 29 depicts the nucleotide sequence of the
construct designated gp140.mut7.modSF162 (SEQ ID NO:42).

Figure 30 depicts the nucleotide sequence of the
construct designated gp140.mut7.modSF162.delV2 (SEQ ID
NO:43).

Figure 31 depicts the nucleotide sequence of the
construct designated gp140.mut7.modSF162.delV1/V2 (SEQ ID
NO:44).

Figure 32 depicts the nucleotide sequence of the
construct designated gp140.mut8.modSF162 (SEQ ID NO:45).

Figure 33 depicts the nucleotide sequence of the
construct designated gp140.mut8.modSF162.delV2 (SEQ ID
NO:46).

Figure 34 depicts the nucleotide sequence of the
construct designated gp140.mut8.modSF162.delV1/V2 (SEQ ID
NO:47).

Figure 35 depicts the nucleotide sequence of the
construct designated gp160.modSF162 (SEQ ID NO:48).

Figure 36 depicts the nucleotide sequence of the
construct designated gp160.modSF162.delV2 (SEQ ID NO:49).

Figure 37 depicts the nucleotide sequence of the
construct designated gp160.modSF162.delV1/V2 (SEQ ID
NO:50).

Figure 38 depicts the nucleotide sequence of the
wild-type gp120 from US4 (SEQ ID NO:51).

Figure 39 depicts the nucleotide sequence of the
wild-type gp140 from US4 (SEQ ID NO:52).

Figure 40 depicts the nucleotide sequence of the
wild-type gp160 from US4 (SEQ ID NO:53).

Figure 41 depicts the nucleotide sequence of the
construct designated gp120.modUS4 (SEQ ID NO:54).

Figure 42 depicts the nucleotide sequence of the
construct designated gp120.modUS4.del 128-194 (SEQ ID
NO:55).

Figure 43 depicts the nucleotide sequence of the
construct designated gp140.modUS4 (SEQ ID NO:56).

Figure 44 depicts the nucleotide sequence of the
construct designated gp140.mut.modUS4 (SEQ ID NO:57).

Figure 45 depicts the nucleotide sequence of the
construct designated gp140.TM.modUS4 (SEQ ID NO:58).

Figure 46 depicts the nucleotide sequence of the
construct designated gp140.modUS4.delV1/V2 (SEQ ID
NO:59).

Figure 47 depicts the nucleotide sequence of the
construct designated gp140.modUS4.delV2 (SEQ ID NO:60).

Figure 48 depicts the nucleotide sequence of the
construct designated gp140.mut.modUS4.delV1/V2 (SEQ ID
NO:61).

Figure 49 depicts the nucleotide sequence of the
construct designated gp140.modUS4.del 128-194 (SEQ ID
NO:62).

Figure 50 depicts the nucleotide sequence of the
construct designated gp140.mut.modUS4.del 128-194 (SEQ ID
NO:63).

Figure 51 depicts the nucleotide sequence of the
construct designated gp160.modUS4 (SEQ ID NO:64).

Figure 52 depicts the nucleotide sequence of the
construct designated gp160.modUS4.delV1 (SEQ ID NO:65).

Figure 53 depicts the nucleotide sequence of the
construct designated gp160.modUS4.delV2 (SEQ ID NO:66).

Figure 54 depicts the nucleotide sequence of the
construct designated gp160.modUS4.delV1/V2 (SEQ ID
NO:67).




Figure 55 depicts the nucleotide sequence of the
construct designated gp160.modUS4.del 128-194 (SEQ ID
NO:68).

Figure 56 depicts the nucleotide sequence of the
common region of Env from wild-type US4 (SEQ ID NO:69).

Figure 57 depicts the nucleotide sequence of the
common region of Env from wild-type SF162 (SEQ ID NO:70).

Figure 58 depicts the nucleotide sequence of
synthetic sequences corresponding to the common region of
Env from US4 (SEQ ID NO:71).

Figure 59 depicts the nucleotide sequence of
synthetic sequences corresponding to the common region of
Env from SF162 (SEQ ID NO:72).

Figure 60 presents a schematic representation of an
Env polypeptide purification strategy.

Figure 61 depicts the nucleotide sequence of the
bicistronic construct designated gp160.modUS4.Gag.modSF2
(SEQ ID NO:73).

Figure 62 depicts the nucleotide sequence of the
bicistronic construct designated
gp160.modSF162.Gag.modSF2 (SEQ ID NO:74).

Figure 63 depicts the nucleotide sequence of the
bicistronic construct designated
gp160.modUS4.delV1/V2.Gag.modSF2 (SEQ ID NO:75).

Figure 64 depicts the nucleotide sequence of the
bicistronic construct designated
gp160.modSF162.delV2.Gag.modSF2 (SEQ ID NO:76).

Figures 65A-65F show micrographs of 293T cells
transfected with the following polypeptide encoding
sequences: Figure 65A, gag.modSF2; Figure 65B,
gp160.modUS4; Figure 65C,
gp160.modUS4.delV1/V2.gag.modSF2 (bicistronic Env and
Gag); Figures 65D and 65E, gp160.modUS4.delV1/V2 and
gag.modSF2; and Figure 65F, gp120.modSF162.delV2 and
gag.modSF2.

Figures 66A and 66B present alignments of selected
modified coding sequences of the present invention
including a common region defined for each group of
synthetic Env expression cassettes. Figure 66A presents
alignments of modified SF162 sequences. Figure 66B
presents alignments of modified US4 sequences. The SEQ
ID NOs for these sequences are presented in Tables 1A and
1B.

Figure 67 shows the ELISA titers (binding
antibodies) obtained in two rhesus macaques (H445, lines
with solid black dots; and J408, lines with open
squares). The y-axis shows the end-point gp140 ELISA titers
and the x-axis shows weeks post-immunization. The dashed
lines at 0, 4, and 8 weeks represent DNA immunizations.
The alternating dash/dotted line at 27 weeks indicates a
DNA plus protein boost immunization.

Figure 68 (SEQ ID NO: 77) depicts the wild-type
nucleotide sequence of Gag reverse transcriptase from 
SF2. 

Figure 69 (SEQ ID NO: 78) depicts the nucleotide 
sequence of the construct designated GP1.

Figure 70 (SEQ ID NO: 79) depicts the nucleotide 
sequence of the construct designated GP2.

Figure 71 (SEQ ID NO: 80) depicts the nucleotide 
sequence of the construct designated
FS(+).protinact.RTopt.YM. FS(+) indicates that there is
a frameshift in the GagPol coding sequence.

Figure 72 (SEQ ID NO: 81) depicts the nucleotide 
sequence of the construct designated 
FS(+).protinact.RTopt.YMWM.

Figure 73 (SEQ ID NO: 82) depicts the nucleotide 
sequence of the construct designated
FS(-).protmod.RTopt.YM. FS(-) indicates that there is no
frameshift in the GagPol coding sequence.

Figure 74 (SEQ ID NO: 83) depicts the nucleotide 
sequence of the construct designated 
FS(-).protmod.RTopt.YMWM.

Figure 75 (SEQ ID NO: 84) depicts the nucleotide 
sequence of the construct designated
FS(-).protmod.RTopt(+).

Figure 76 (SEQ ID NO: 85) depicts the nucleotide 
sequence of wild type Tat from isolate SF162. 

Figure 77 (SEQ ID NO: 86) depicts the amino acid 
sequence of the tat polypeptide. 

Figure 78 (SEQ ID NO: 87) depicts the nucleotide 
sequence of a synthetic Tat construct designated 
Tat.SF162.opt.

Figure 79 (SEQ ID NO: 88) depicts the nucleotide 
sequence of a synthetic Tat construct designated 
tat.cys22.sf162.opt. The construct encodes a tat
polypeptide in which the cysteine residue at position 22
of the wild type Tat polypeptide is replaced by a glycine
residue.

Figures 80A to 80E are an alignment of the 
nucleotide sequences of the constructs designated 
Gag.mod.SF2, GP1 (SEQ ID NO:78), and GP2 (SEQ ID NO:79).

Figure 81 (SEQ ID NO: 89) depicts the nucleotide 
sequence of the construct designated tataminoSF162.opt,
which encodes the amino terminus of the tat protein.
The codon encoding the cysteine-22 residue is underlined.

Figure 82 (SEQ ID NO: 90) depicts the amino acid 
sequence of the polypeptide encoded by the construct 
designated tat.cys22.SF162.opt (SEQ ID NO:88).

Detailed Description of the Invention

The practice of the present invention will employ, 
unless otherwise indicated, conventional methods of 
chemistry, biochemistry, molecular biology, immunology 
and pharmacology, within the skill of the art. Such
techniques are explained fully in the literature. See,
e.g., Remington's Pharmaceutical Sciences, 18th Edition
(Easton, Pennsylvania: Mack Publishing Company, 1990);
Methods In Enzymology (S. Colowick and N. Kaplan, eds.,
Academic Press, Inc.); and Handbook of Experimental
Immunology, Vols. I-IV (D.M. Weir and C.C. Blackwell,
eds., 1986, Blackwell Scientific Publications); Sambrook,
et al., Molecular Cloning: A Laboratory Manual (2nd
Edition, 1989); Short Protocols in Molecular Biology, 4th
ed. (Ausubel et al. eds., 1999, John Wiley & Sons);
Molecular Biology Techniques: An Intensive Laboratory
Course, (Ream et al., eds., 1998, Academic Press); PCR
(Introduction to Biotechniques Series), 2nd ed. (Newton &
Graham eds., 1997, Springer Verlag).

As used in this specification and the appended 
claims, the singular forms "a," "an" and "the" include 
plural references unless the content clearly dictates 
otherwise. Thus, for example, reference to "an antigen" 
includes a mixture of two or more such agents.

1. Definitions

In describing the present invention, the following 
terms will be employed, and are intended to be defined as 
indicated below. 

"Synthetic" sequences, as used herein, refers to 
Env-, tat- or Gag-encoding polynucleotides whose
expression has been optimized as described herein, for
example, by codon substitution, deletions, replacements
and/or inactivation of inhibitory sequences. "Wild-type"
or "native" sequences, as used herein, refers to
polypeptide encoding sequences that are essentially as
they are found in nature, e.g., Gag encoding sequences as
found in the isolate HIV-1SF2 or Env encoding sequences
as found in the isolates HIV-1SF162 or HIV-1US4.

As used herein, the term "virus-like particle" or 
"VLP" refers to a nonreplicating, viral shell, derived 
from any of several viruses discussed further below. 
VLPs are generally composed of one or more viral 
proteins, such as, but not limited to those proteins 
referred to as capsid, coat, shell, surface and/or
envelope proteins, or particle-forming polypeptides
derived from these proteins. VLPs can form spontaneously
upon recombinant expression of the protein in an 
appropriate expression system. Methods for producing 
particular VLPs are known in the art and discussed more 
fully below. The presence of VLPs following recombinant 
expression of viral proteins can be detected using 
conventional techniques known in the art, such as by 
electron microscopy, biophysical characterization, and 
the like. See, e.g., Baker et al., Biophys. J. (1991)
60:1445-1456; Hagensee et al., J. Virol. (1994)
68:4503-4505. For example, VLPs can be isolated by density
gradient centrifugation and/or identified by
characteristic density banding (e.g., Example 7).
Alternatively, cryoelectron microscopy can be performed 
on vitrified aqueous samples of the VLP preparation in 
question, and images recorded under appropriate exposure 
conditions. 

By "particle-forming polypeptide" derived from a
particular viral protein is meant a full-length or near
full-length viral protein, as well as a fragment thereof,
or a viral protein with internal deletions, which has the
ability to form VLPs under conditions that favor VLP
formation. Accordingly, the polypeptide may comprise the
full-length sequence, fragments, truncated and partial
sequences, as well as analogs and precursor forms of the
reference molecule. The term therefore intends
deletions, additions and substitutions to the sequence,
so long as the polypeptide retains the ability to form a
VLP. Thus, the term includes natural variations of the
specified polypeptide since variations in coat proteins
often occur between viral isolates. The term also
includes deletions, additions and substitutions that do
not naturally occur in the reference protein, so long as
the protein retains the ability to form a VLP. Preferred
substitutions are those which are conservative in nature,
i.e., those substitutions that take place within a family
of amino acids that are related in their side chains.
Specifically, amino acids are generally divided into four
families: (1) acidic -- aspartate and glutamate; (2)
basic -- lysine, arginine, histidine; (3) non-polar --
alanine, valine, leucine, isoleucine, proline,
phenylalanine, methionine, tryptophan; and (4) uncharged
polar -- glycine, asparagine, glutamine, cysteine, serine,
threonine, tyrosine. Phenylalanine, tryptophan, and
tyrosine are sometimes classified as aromatic amino
acids.
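
By way of illustration only, the four side-chain families
listed above can be encoded as a lookup table and used to test
whether a given substitution is conservative in the sense
defined here. The following is a minimal sketch; the names
FAMILIES, family_of and is_conservative are hypothetical
and are not part of the specification. Residues are given
in the standard one-letter code.

```python
# Illustrative sketch only: the four side-chain families from the
# definition above, keyed by standard one-letter amino acid codes.
# The names used here are hypothetical helpers, not part of the source.
FAMILIES = {
    "acidic": set("DE"),                # aspartate, glutamate
    "basic": set("KRH"),                # lysine, arginine, histidine
    "non-polar": set("AVLIPFMW"),       # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "uncharged polar": set("GNQCSTY"),  # Gly, Asn, Gln, Cys, Ser, Thr, Tyr
}


def family_of(residue):
    """Return the family name for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in FAMILIES.items():
        if code in members:
            return name
    raise ValueError(f"not a standard amino acid code: {residue!r}")


def is_conservative(original, replacement):
    """True when both residues fall within the same side-chain family."""
    return family_of(original) == family_of(replacement)
```

For example, is_conservative("D", "E") holds because aspartate
and glutamate are both acidic, whereas is_conservative("K", "E")
does not, because lysine is basic and glutamate is acidic.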

An "antigen" refers to a molecule containing one or 
more epitopes (either linear, conformational or both) 
that will stimulate a host's immune system to make a 
humoral and/or cellular antigen-specific response. The 
term is used interchangeably with the term "immunogen."
Normally, a B-cell epitope will include at least about 5
amino acids but can be as small as 3-4 amino acids. A
T-cell epitope, such as a CTL epitope, will include at
least about 7-9 amino acids, and a helper T-cell epitope
at least about 12-20 amino acids. Normally, an epitope
will include between about 7 and 15 amino acids, such as
9, 10, 12 or 15 amino acids. The term "antigen" denotes
both subunit antigens (i.e., antigens which are separate
and discrete from a whole organism with which the antigen
is associated in nature), as well as killed, attenuated
or inactivated bacteria, viruses, fungi, parasites or
other microbes. Antibodies such as anti-idiotype
antibodies, or fragments thereof, and synthetic peptide
mimotopes, which can mimic an antigen or antigenic
determinant, are also captured under the definition of
antigen as used herein. Similarly, an oligonucleotide or
polynucleotide which expresses an antigen or antigenic
determinant in vivo, such as in gene therapy and DNA
immunization applications, is also included in the
definition of antigen herein.

For purposes of the present invention, antigens can 
be derived from any of several known viruses, bacteria, 
parasites and fungi, as described more fully below. The
term also intends any of the various tumor antigens. 
Furthermore, for purposes of the present invention, an 
"antigen" refers to a protein which includes 
modifications, such as deletions, additions and 
substitutions (generally conservative in nature) , to the 
native sequence, so long as the protein maintains the 
ability to elicit an immunological response, as defined 
herein. These modifications may be deliberate, as
through site-directed mutagenesis, or may be accidental,
such as through mutations of hosts which produce the
antigens.

An "immunological response" to an antigen or 
composition is the development in a subject of a humoral 
and/or a cellular immune response to an antigen present 
in the composition of interest. For purposes of the 
present invention, a "humoral immune response" refers to
an immune response mediated by antibody molecules, while
a "cellular immune response" is one mediated by
T-lymphocytes and/or other white blood cells. One
important aspect of cellular immunity involves an
antigen-specific response by cytolytic T-cells ("CTL"s).
CTLs have specificity for peptide antigens that are
presented in association with proteins encoded by the
major histocompatibility complex (MHC) and expressed on
the surfaces of cells. CTLs help induce and promote the
destruction of intracellular microbes, or the lysis of
cells infected with such microbes. Another aspect of
cellular immunity involves an antigen-specific response
by helper T-cells. Helper T-cells act to help stimulate
the function, and focus the activity of, nonspecific
effector cells against cells displaying peptide antigens
in association with MHC molecules on their surface. A
"cellular immune response" also refers to the production
of cytokines, chemokines and other such molecules
produced by activated T-cells and/or other white blood
cells, including those derived from CD4+ and CD8+
T-cells.

A composition or vaccine that elicits a cellular 
immune response may serve to sensitize a vertebrate 
subject by the presentation of antigen in association 
with MHC molecules at the cell surface. The cell-mediated
immune response is directed at, or near, cells
presenting antigen at their surface. In addition,
antigen-specific T-lymphocytes can be generated to allow
for the future protection of an immunized host. 

The ability of a particular antigen to stimulate a 
cell-mediated immunological response may be determined by
a number of assays, such as by lymphoproliferation
(lymphocyte activation) assays, CTL cytotoxic cell
assays, or by assaying for T-lymphocytes specific for the
antigen in a sensitized subject. Such assays are well
known in the art. See, e.g., Erickson et al., J.
Immunol. (1993) 151:4189-4199; Doe et al., Eur. J.
Immunol. (1994) 21:2369-2376. Recent methods of
measuring cell-mediated immune response include
measurement of intracellular cytokines or cytokine
secretion by T-cell populations, or by measurement of
epitope specific T-cells (e.g., by the tetramer
technique) (reviewed by McMichael, A.J., and O'Callaghan,
C.A., J. Exp. Med. 187(9):1367-1371, 1998;
McHeyzer-Williams, M.G., et al., Immunol. Rev. 150:5-21, 1996;
Lalvani, A., et al., J. Exp. Med. 186:859-865, 1997).

Thus, an immunological response as used herein may 
be one which stimulates the production of CTLs, and/or 
the production or activation of helper T-cells. The
antigen of interest may also elicit an antibody-mediated
immune response. Hence, an immunological response may
include one or more of the following effects: the
production of antibodies by B-cells; and/or the
activation of suppressor T-cells and/or γδ T-cells
directed specifically to an antigen or antigens present
in the composition or vaccine of interest. These
responses may serve to neutralize infectivity, and/or
mediate antibody-complement, or antibody dependent cell
cytotoxicity (ADCC) to provide protection to an immunized
host. Such responses can be determined using standard
immunoassays and neutralization assays, well known in the
art.

An "immunogenic composition" is a composition that
comprises an antigenic molecule where administration of 
the composition to a subject results in the development 
in the subject of a humoral and/or a cellular immune 
response to the antigenic molecule of interest. 






WO 00/39302 



PCT/US99/31245 



By "subunit vaccine" is meant a vaccine composition 
which includes one or more selected antigens but not all 
antigens, derived from or homologous to, an antigen from 
a pathogen of interest such as from a virus, bacterium, 
parasite or fungus. Such a composition is substantially
free of intact pathogen cells or pathogenic particles, or
the lysate of such cells or particles. Thus, a "subunit
vaccine" can be prepared from at least partially purified
(preferably substantially purified) immunogenic
polypeptides from the pathogen, or analogs thereof. The
method of obtaining an antigen included in the subunit 
vaccine can thus include standard purification 
techniques, recombinant production, or synthetic 
production. 

"Substantially purified" generally refers to isolation
of a substance (compound, polynucleotide, protein,
polypeptide, polypeptide composition) such that the
substance comprises the majority percent of the sample in
which it resides. Typically in a sample a substantially
purified component comprises 50%, preferably 80%-85%,
more preferably 90-95% of the sample. Techniques for
purifying polynucleotides and polypeptides of interest
are well-known in the art and include, for example,
ion-exchange chromatography, affinity chromatography and
sedimentation according to density.

A "coding sequence" or a sequence which "encodes" a 
selected polypeptide, is a nucleic acid molecule which is 
transcribed (in the case of DNA) and translated (in the
case of mRNA) into a polypeptide in vivo when placed
under the control of appropriate regulatory sequences (or
"control elements"). The boundaries of the coding
sequence are determined by a start codon at the 5'
(amino) terminus and a translation stop codon at the 3'
(carboxy) terminus. A coding sequence can include, but
is not limited to, cDNA from viral, procaryotic or
eucaryotic mRNA, genomic DNA sequences from viral or
procaryotic DNA, and even synthetic DNA sequences. A
transcription termination sequence may be located 3' to
the coding sequence.
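
The boundary rule stated above (a start codon at the 5' end
and an in-frame translation stop codon at the 3' end) can be
sketched as follows. This is an illustrative simplification
only: it assumes a DNA sense strand, the standard genetic
code, and that the first ATG encountered is the start codon.
The function name find_coding_sequence is hypothetical.

```python
# Illustrative sketch only. Assumes the standard genetic code and that
# the first ATG in the sense strand is the start codon; actual start-site
# selection depends on sequence context and is not modeled here.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna):
    """Return the span from the first ATG through the first in-frame
    stop codon (inclusive), or None if no such span exists."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    # Step through the sequence codon by codon, in frame with the ATG.
    for pos in range(start, len(dna) - 2, 3):
        if dna[pos:pos + 3] in STOP_CODONS:
            return dna[start:pos + 3]
    return None
```

Applied to the string "ccATGGCCTAAgg", the sketch returns
"ATGGCCTAA", i.e., the start codon, one sense codon and the
stop codon, with the flanking untranslated bases excluded.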

Typical "control elements" include, but are not
limited to, transcription promoters, transcription
enhancer elements, transcription termination signals,
polyadenylation sequences (located 3' to the translation
stop codon), sequences for optimization of initiation of
translation (located 5' to the coding sequence), and
translation termination sequences, see e.g., McCaughan et
al. (1995) PNAS USA 92:5431-5435; Kochetov et al. (1998)
FEBS Letts. 440:351-355.

A "nucleic acid" molecule can include, but is not
limited to, procaryotic sequences, eucaryotic mRNA, cDNA 
from eucaryotic mRNA, genomic DNA sequences from 
eucaryotic (e.g., mammalian) DNA, and even synthetic DNA 
sequences. The term also captures sequences that include 
any of the known base analogs of DNA and RNA. 

"Operably linked" refers to an arrangement of 
elements wherein the components so described are 
configured so as to perform their usual function. Thus,
a given promoter operably linked to a coding sequence is
capable of effecting the expression of the coding
sequence when the proper enzymes are present. The
promoter need not be contiguous with the coding sequence,
so long as it functions to direct the expression thereof.
Thus, for example, intervening untranslated yet 
transcribed sequences can be present between the promoter 
sequence and the coding sequence and the promoter 
sequence can still be considered "operably linked" to the 
coding sequence.

"Recombinant" as used herein to describe a nucleic
acid molecule means a polynucleotide of genomic, cDNA,
semisynthetic, or synthetic origin which, by virtue of
its origin or manipulation: (1) is not associated with
all or a portion of the polynucleotide with which it is
associated in nature; and/or (2) is linked to a
polynucleotide other than that to which it is linked in
nature. The term "recombinant" as used with respect to a
protein or polypeptide means a polypeptide produced by
expression of a recombinant polynucleotide. "Recombinant
host cells," "host cells," "cells," "cell lines," "cell
cultures," and other such terms denoting procaryotic
microorganisms or eucaryotic cell lines cultured as 
unicellular entities, are used interchangeably, and refer 
to cells which can be, or have been, used as recipients 
for recombinant vectors or other transfer DNA, and 
include the progeny of the original cell which has been 
transfected. It is understood that the progeny of a 
single parental cell may not necessarily be completely 
identical in morphology or in genomic or total DNA 
complement to the original parent, due to accidental or 
deliberate mutation. Progeny of the parental cell which 
are sufficiently similar to the parent to be 
characterized by the relevant property, such as the 
presence of a nucleotide sequence encoding a desired 
peptide, are included in the progeny intended by this 
definition, and are covered by the above terms.

Techniques for determining amino acid sequence 
"similarity" are well known in the art. In general, 
"similarity" means the exact amino acid to amino acid 
comparison of two or more polypeptides at the appropriate 
place, where amino acids are identical or possess similar 
chemical and/or physical properties such as charge or 
hydrophobicity. A so-termed "percent similarity" then
can be determined between the compared polypeptide
sequences. Techniques for determining nucleic acid and
amino acid sequence identity also are well known in the
art and include determining the nucleotide sequence of
the mRNA for that gene (usually via a cDNA intermediate)
and determining the amino acid sequence encoded thereby,
and comparing this to a second amino acid sequence. In
general, "identity" refers to an exact nucleotide to
nucleotide or amino acid to amino acid correspondence of
two polynucleotides or polypeptide sequences,
respectively. 

Two or more polynucleotide sequences can be compared 
by determining their "percent identity." Two or more 
amino acid sequences likewise can be compared by 
determining their "percent identity." The percent 
identity of two sequences, whether nucleic acid or 
peptide sequences, is generally described as the number 
of exact matches between two aligned sequences divided by 
the length of the shorter sequence and multiplied by 100. 
An approximate alignment for nucleic acid sequences is 
provided by the local homology algorithm of Smith and 
Waterman, Advances in Applied Mathematics 2:482-489
(1981). This algorithm can be extended to use with
peptide sequences using the scoring matrix developed by
Dayhoff, Atlas of Protein Sequences and Structure, M.O.
Dayhoff ed., 5 suppl. 3:353-358, National Biomedical
Research Foundation, Washington, D.C., USA, and
normalized by Gribskov, Nucl. Acids Res. 14(6):6745-6763
(1986). An implementation of this algorithm for nucleic
acid and peptide sequences is provided by the Genetics
Computer Group (Madison, WI) in their BestFit utility
application. The default parameters for this method are
described in the Wisconsin Sequence Analysis Package
Program Manual, Version 8 (1995) (available from Genetics
Computer Group, Madison, WI). Other equally suitable
programs for calculating the percent identity or
similarity between sequences are generally known in the
art.
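
The percent identity calculation described above (exact
matches between two aligned sequences, divided by the length
of the shorter sequence, multiplied by 100) can be sketched
as follows. The sketch assumes the two sequences have already
been aligned by one of the programs mentioned and are supplied
as equal-length strings in which "-" marks a gap; the function
name percent_identity and the gap character are assumptions
made for illustration.

```python
# Illustrative sketch only. Inputs are two already-aligned sequences of
# equal length; "-" is assumed to mark an alignment gap.
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Exact matches divided by the length of the shorter (ungapped)
    sequence, multiplied by 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both sequences carry the same residue.
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    # Length of each original sequence, ignoring gap characters.
    shorter = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter
```

For two aligned eight-nucleotide sequences that differ at one
position, such as "ACGTACGT" and "ACGTTCGT", the sketch gives
7/8 x 100 = 87.5 percent identity.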

For example, percent identity of a particular 
nucleotide sequence to a reference sequence can be 
determined using the homology algorithm of Smith and 
Waterman with a default scoring table and a gap penalty
of six nucleotide positions. Another method of
establishing percent identity in the context of the
present invention is to use the MPSRCH package of
programs copyrighted by the University of Edinburgh,
developed by John F. Collins and Shane S. Sturrok, and
distributed by IntelliGenetics, Inc. (Mountain View, CA).
From this suite of packages, the Smith-Waterman algorithm
can be employed where default parameters are used for the
scoring table (for example, gap open penalty of 12, gap
extension penalty of one, and a gap of six). From the
data generated, the "Match" value reflects "sequence
identity." Other suitable programs for calculating the
percent identity or similarity between sequences are
generally known in the art, such as the alignment program
BLAST, which can also be used with default parameters.
For example, BLASTN and BLASTP can be used with the
following default parameters: genetic code = standard;
filter = none; strand = both; cutoff = 60; expect = 10;
Matrix = BLOSUM62; Descriptions = 50 sequences; sort by =
HIGH SCORE; Databases = non-redundant, GenBank + EMBL +
DDBJ + PDB + GenBank CDS translations + Swiss protein +
Spupdate + PIR. Details of these programs can be found at
the following internet address:
http://www.ncbi.nlm.gov/cgi-bin/BLAST.
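
For orientation, the local homology algorithm of Smith and
Waterman referred to above can be sketched in a few lines.
The sketch below is not the BestFit or MPSRCH implementation
and does not use their default parameters; the scoring values
(match +1, mismatch -1, linear gap penalty -2) and the
function name smith_waterman_score are assumptions chosen
only to keep the example small. It returns the best local
alignment score rather than the alignment itself.

```python
# Illustrative sketch only. Scoring values are assumed for the example
# and are not the defaults of any program named in the text.
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b, using a
    linear gap penalty and two rolling rows of the scoring matrix."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never
            # drop below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

With these assumed scores, two sequences that share only the
four-nucleotide stretch "ACGT", such as "TTACGTTT" and
"GGACGTGG", yield a best local score of 4.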

One of skill in the art can readily determine the 
proper search parameters to use for a given sequence in 
the above programs. For example, the search parameters
may vary based on the size of the sequence in question. 
Thus, for example, a representative embodiment of the 
present invention would include an isolated 
polynucleotide having X contiguous nucleotides, wherein 
(i) the X contiguous nucleotides have at least about 50% 
identity to Y contiguous nucleotides derived from any of 
the sequences described herein, (ii) X equals Y, and
(iii) X is greater than or equal to 6 nucleotides and up 
to 5000 nucleotides, preferably greater than or equal to 
8 nucleotides and up to 5000 nucleotides, more preferably 
10-12 nucleotides and up to 5000 nucleotides, and even 
more preferably 15-20 nucleotides, up to the number of 
nucleotides present in the full-length sequences 
described herein (e.g., see the Sequence Listing and 
claims) , including all integer values falling within the 
above-described ranges.
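
One way to picture the representative embodiment above is as
a sliding-window test: a query of X contiguous nucleotides is
compared against every stretch of Y = X contiguous nucleotides
in a reference sequence, and the best percent identity is
checked against the threshold. The sketch below assumes a
simple ungapped, position-by-position comparison, which is a
simplification of the alignment programs discussed above; the
function names and the default threshold argument are
hypothetical.

```python
# Illustrative sketch only. Uses an ungapped, position-by-position
# comparison, which is a simplifying assumption.
def best_window_identity(query, reference):
    """Highest percent identity between the query (X nucleotides) and
    any window of Y = X contiguous nucleotides in the reference."""
    query, reference = query.upper(), reference.upper()
    x = len(query)
    if not 6 <= x <= 5000:
        raise ValueError("X must be between 6 and 5000 nucleotides")
    if len(reference) < x:
        raise ValueError("reference is shorter than the query")
    best = 0.0
    for start in range(len(reference) - x + 1):
        window = reference[start:start + x]
        matches = sum(q == r for q, r in zip(query, window))
        best = max(best, 100.0 * matches / x)
    return best


def meets_identity_threshold(query, reference, threshold=50.0):
    """True when some window reaches at least the given percent identity."""
    return best_window_identity(query, reference) >= threshold
```

For instance, the six-nucleotide query "ACGTAC" occurs exactly
within "TTACGTACTT", so the best window identity is 100
percent and the 50 percent criterion is met.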

The synthetic expression cassettes (and purified 
polynucleotides) of the present invention include related 
polynucleotide sequences having about 80% to 100%,
greater than 80-85%, preferably greater than 90-92%, more
preferably greater than 95%, and most preferably greater
than 98% sequence identity (including all integer values
falling within these described ranges) to the synthetic
expression cassette sequences disclosed herein (for
example, to the sequences presented in Tables 1A and 1B)
when the sequences of the present invention are used as 
the query sequence.

Two nucleic acid fragments are considered to
"selectively hybridize" as described herein. The degree 
of sequence identity between two nucleic acid molecules 
affects the efficiency and strength of hybridization 
events between such molecules. A partially identical
nucleic acid sequence will at least partially inhibit a
completely identical sequence from hybridizing to a
target molecule. Inhibition of hybridization of the
completely identical sequence can be assessed using
hybridization assays that are well known in the art
(e.g., Southern blot, Northern blot, solution
hybridization, or the like, see Sambrook, et al.,
Molecular Cloning: A Laboratory Manual, Second Edition
(1989) Cold Spring Harbor, N.Y.). Such assays can be
conducted using varying degrees of selectivity, for
example, using conditions varying from low to high
stringency. If conditions of low stringency are
employed, the absence of non-specific binding can be
assessed using a secondary probe that lacks even a
partial degree of sequence identity (for example, a probe 
having less than about 30% sequence identity with the 
target molecule) , such that, in the absence of non- 
specific binding events, the secondary probe will not 
hybridize to the target. 

When utilizing a hybridization-based detection 
system, a nucleic acid probe is chosen that is 
complementary to a target nucleic acid sequence, and then 
by selection of appropriate conditions the probe and the 
target sequence "selectively hybridize," or bind, to each 
other to form a hybrid molecule. A nucleic acid molecule 
that is capable of hybridizing selectively to a target 
sequence under "moderately stringent" conditions typically
hybridizes under conditions that allow detection of a
target nucleic acid sequence of at least about 10-14
nucleotides in length having at least approximately 70%
sequence identity with the sequence of the selected
nucleic acid probe. Stringent hybridization conditions
typically allow detection of target nucleic acid
sequences of at least about 10-14 nucleotides in length
having a sequence identity of greater than about 90-95%
with the sequence of the selected nucleic acid probe.
Hybridization conditions useful for probe/target
hybridization where the probe and target have a specific
degree of sequence identity, can be determined as is
known in the art (see, for example, Nucleic Acid
Hybridization: A Practical Approach, editors B.D. Hames
and S.J. Higgins, (1985) Oxford; Washington, DC; IRL
Press).

With respect to stringency conditions for 
hybridization, it is well known in the art that numerous 
equivalent conditions can be employed to establish a 
particular stringency by varying, for example, the 
following factors: the length and nature of probe and 
target sequences, base composition of the various 
sequences, concentrations of salts and other 
hybridization solution components, the presence or 
absence of blocking agents in the hybridization solutions 
(e.g., formamide, dextran sulfate, and polyethylene 
glycol), hybridization reaction temperature and time
parameters, as well as varying wash conditions. A
particular set of hybridization conditions
is selected following standard methods in the art (see,
for example, Sambrook, et al., Molecular Cloning: A
Laboratory Manual, Second Edition, (1989) Cold Spring
Harbor, N.Y.).

A first polynucleotide is "derived from" a second
polynucleotide if it has the same or substantially the
same basepair sequence as a region of the second
polynucleotide, its cDNA, complements thereof, or if it
displays sequence identity as described above. 

A first polypeptide is "derived from" a second 
polypeptide if it is (i) encoded by a first 
polynucleotide derived from a second polynucleotide, or
(ii) displays sequence identity to the second
polypeptide as described above.

Generally, a viral polypeptide is "derived from" a 
particular polypeptide of a virus (viral polypeptide) if 
it is (i) encoded by an open reading frame of a
polynucleotide of that virus (viral polynucleotide), or
(ii) displays sequence identity to polypeptides of that 
virus as described above. 

"Encoded by" refers to a nucleic acid sequence which 
codes for a polypeptide sequence, wherein the polypeptide 
sequence or a portion thereof contains an amino acid 
sequence of at least 3 to 5 amino acids, more preferably 
at least 8 to 10 amino acids, and even more preferably at 
least 15 to 20 amino acids from a polypeptide encoded by 
the nucleic acid sequence. Also encompassed are
polypeptide sequences which are immunologically 
identifiable with a polypeptide encoded by the sequence. 

"Purified polynucleotide" refers to a polynucleotide 
of interest or fragment thereof which is essentially 
free, e.g., contains less than about 50%, preferably less 
than about 70%, and more preferably less than about 90%, 
of the protein with which the polynucleotide is naturally 
associated. Techniques for purifying polynucleotides of 
interest are well-known in the art and include, for
example, disruption of the cell containing the
polynucleotide with a chaotropic agent and separation of
the polynucleotide(s) and proteins by ion-exchange
chromatography, affinity chromatography and sedimentation
according to density.

By "nucleic acid immunization" is meant the 
introduction of a nucleic acid molecule encoding one or 
more selected antigens into a host cell, for the in vivo 
expression of an antigen, antigens, an epitope, or 
epitopes. The nucleic acid molecule can be introduced
directly into a recipient subject, such as by injection, 
inhalation, oral, intranasal and mucosal administration, 
or the like, or can be introduced ex vivo, into cells 
which have been removed from the host. In the latter 
case, the transformed cells are reintroduced into the
subject where an immune response can be mounted against
the antigen encoded by the nucleic acid molecule. 

"Gene transfer" or "gene delivery" refers to methods 
or systems for reliably inserting DNA or RNA of interest 
into a host cell. Such methods can result in transient
expression of non-integrated transferred DNA,
extrachromosomal replication and expression of
transferred replicons (e.g., episomes), or integration of
transferred genetic material into the genomic DNA of host 
cells. Gene delivery expression vectors include, but are 
not limited to, vectors derived from bacterial plasmid 
vectors, viral vectors, non-viral vectors, alphaviruses, 
pox viruses and vaccinia viruses. When used for 
immunization, such gene delivery expression vectors may 
be referred to as vaccines or vaccine vectors. 

"T lymphocytes" or "T cells" are non-antibody
producing lymphocytes that constitute a part of the
cell-mediated arm of the immune system. T cells arise from
immature lymphocytes that migrate from the bone marrow to
the thymus, where they undergo a maturation process under
the direction of thymic hormones. Here, the mature
lymphocytes rapidly divide increasing to very large
numbers. The maturing T cells become immunocompetent
based on their ability to recognize and bind a specific
antigen. Activation of immunocompetent T cells is 
triggered when an antigen binds to the lymphocyte's 
surface receptors. 

The term "transfection" is used to refer to the
uptake of foreign DNA by a cell. A cell has been
"transfected" when exogenous DNA has been introduced
inside the cell membrane. A number of transfection
techniques are generally known in the art. See, e.g.,
Graham et al. (1973) Virology, 52:456, Sambrook et al.
(1989) Molecular Cloning, a laboratory manual, Cold
Spring Harbor Laboratories, New York, Davis et al. (1986)
Basic Methods in Molecular Biology, Elsevier, and Chu et
al. (1981) Gene 13:197. Such techniques can be used to
introduce one or more exogenous DNA moieties into
suitable host cells. The term refers to both stable and
transient uptake of the genetic material, and includes
uptake of peptide- or antibody-linked DNAs.

A "vector" is capable of transferring gene sequences 
to target cells (e.g., bacterial plasmid vectors, viral
vectors, non-viral vectors, particulate carriers, and
liposomes). Typically, "vector construct," "expression
vector," and "gene transfer vector," mean any nucleic
acid construct capable of directing the expression of a
gene of interest and which can transfer gene sequences to
target cells. Thus, the term includes cloning and
expression vehicles, as well as viral vectors. 

Transfer of a "suicide gene" (e.g., a drug- 
susceptibility gene) to a target cell renders the cell 
sensitive to compounds or compositions that are
relatively nontoxic to normal cells. Moolten, F.L.
(1994) Cancer Gene Ther. 1:279-287. Examples of suicide
genes are thymidine kinase of herpes simplex virus (HSV-tk),
cytochrome P450 (Manome et al. (1996) Gene Therapy
3:513-520), human deoxycytidine kinase (Manome et al.
(1996) Nature Medicine 2(5):567-573) and the bacterial
enzyme cytosine deaminase (Dong et al. (1996) Human Gene
Therapy 7:713-720). Cells which express these genes are
rendered sensitive to the effects of the relatively
nontoxic prodrugs ganciclovir (HSV-tk), cyclophosphamide
(cytochrome P450 2B1), cytosine arabinoside (human
deoxycytidine kinase) or 5-fluorocytosine (bacterial
cytosine deaminase). Culver et al. (1992) Science
256:1550-1552; Huber et al. (1994) Proc. Natl. Acad. Sci.
USA 91:8302-8306.

A "selectable marker" or "reporter marker" refers to 
a nucleotide sequence included in a gene transfer vector 
that has no therapeutic activity, but rather is included 
to allow for simpler preparation, manufacturing, 
characterization or testing of the gene transfer vector. 

A "specific binding agent" refers to a member of a 
specific binding pair of molecules wherein one of the 
molecules specifically binds to the second molecule 
through chemical and/or physical means. One example of a
specific binding agent is an antibody directed against a 
selected antigen. 

By "subject" is meant any member of the subphylum 
chordata, including, without limitation, humans and other 
primates, including non-human primates such as 
chimpanzees and other apes and monkey species; farm 
animals such as cattle, sheep, pigs, goats and horses; 
domestic mammals such as dogs and cats; laboratory 
animals including rodents such as mice, rats and guinea 
pigs; birds, including domestic, wild and game birds such
as chickens, turkeys and other gallinaceous birds, ducks,
geese, and the like. The term does not denote a
particular age. Thus, both adult and newborn individuals
are intended to be covered. The system described above
is intended for use in any of the above vertebrate
species, since the immune systems of all of these
vertebrates operate similarly.

By "pharmaceutically acceptable" or 
"pharmacologically acceptable" is meant a material which 
is not biologically or otherwise undesirable, i.e., the 
material may be administered to an individual in a 
formulation or composition without causing any 
undesirable biological effects or interacting in a 
deleterious manner with any of the components of the
composition in which it is contained.

By "physiological pH" or a "pH in the physiological
range" is meant a pH in the range of approximately 7.2 to 
8.0 inclusive, more typically in the range of 
approximately 7.2 to 7.6 inclusive. 

As used herein, "treatment" refers to any of (i) the
prevention of infection or reinfection, as in a 
traditional vaccine, (ii) the reduction or elimination of 
symptoms, and (iii) the substantial or complete 
elimination of the pathogen in question. Treatment may 
be effected prophylactically (prior to infection) or 
therapeutically (following infection).

"Lentiviral vector", and "recombinant lentiviral 
vector" are derived from the subset of retroviral vectors
known as lentiviruses. Lentiviral vectors refer to a
nucleic acid construct which carries, and within certain
embodiments, is capable of directing the expression of a
nucleic acid molecule of interest. The lentiviral vector
includes at least one transcriptional promoter/enhancer
or locus defining element(s), or other elements which
control gene expression by other means such as alternate
splicing, nuclear RNA export, post-transcriptional
modification of messenger, or post-translational
modification of protein. Such vector constructs must
also include a packaging signal, long terminal repeats
(LTRs) or portion thereof, and positive and negative
strand primer binding sites appropriate to the lentiviral
vector used (if these are not already present in the
retroviral vector). Optionally, the recombinant
lentiviral vector may also include a signal which directs
polyadenylation, selectable markers such as Neo, TK,
hygromycin, phleomycin, histidinol, or DHFR, as well as
one or more restriction sites and a translation
termination sequence. By way of example, such vectors
typically include a 5' LTR, a tRNA binding site, a
packaging signal, an origin of second strand DNA
synthesis, and a 3' LTR or a portion thereof.

"Lentiviral vector particle" as utilized within the
present invention refers to a lentivirus which carries at 
least one gene of interest. The retrovirus may also 
contain a selectable marker. The recombinant lentivirus 
is capable of reverse transcribing its genetic material 
(RNA) into DNA and incorporating this genetic material 
into a host cell's DNA upon infection. Lentiviral vector 
particles may have a lentiviral envelope, a non- 
lentiviral envelope (e.g., an ampho or VSV-G envelope), 
or a chimeric envelope. 

"Nucleic acid expression vector" or "Expression 
cassette" refers to an assembly which is capable of 
directing the expression of a sequence or gene of 
interest. The nucleic acid expression vector includes a 
promoter which is operably linked to the sequences or 
gene(s) of interest. Other control elements may be 
present as well. Expression cassettes described herein
may be contained within a plasmid construct. In addition
to the components of the expression cassette, the plasmid
construct may also include a bacterial origin of
replication, one or more selectable markers, a signal
which allows the plasmid construct to exist as single-
stranded DNA (e.g., an M13 origin of replication), a
multiple cloning site, and a "mammalian" origin of
replication (e.g., an SV40 or adenovirus origin of
replication).

"Packaging cell" refers to a cell which contains 
those elements necessary for production of infectious 
recombinant retrovirus (e.g., lentivirus) which are 
lacking in a recombinant retroviral vector. Typically, 
such packaging cells contain one or more expression
cassettes which are capable of expressing the Gag, pol
and env proteins.

"Producer cell" or "vector producing cell" refers to 
a cell which contains all elements necessary for 
production of recombinant retroviral vector particles. 



2. Modes of Carrying Out the Invention

Before describing the present invention in detail, 
it is to be understood that this invention is not limited 
to particular formulations or process parameters as such 
may, of course, vary. It is also to be understood that
the terminology used herein is for the purpose of 
describing particular embodiments of the invention only, 
and is not intended to be limiting. 

Although a number of methods and materials similar 
or equivalent to those described herein can be used in 
the practice of the present invention, the preferred 
materials and methods are described herein.

2.1 Synthetic Expression Cassettes

2.1.1 Modification of HIV-1 Gag Nucleic Acid Coding Sequences

One aspect of the present invention is the 
generation of HIV-1 Gag protein coding sequences, and 
related sequences, having improved expression relative to 
the corresponding wild-type sequence. An exemplary 
embodiment of the present invention is illustrated herein
by modifying the Gag protein wild-type sequences obtained
from the HIV-1SF2 strain (SEQ ID NO:1; Sanchez-Pescador,
R., et al., Science 227(4686):484-492, 1985; Luciw,
P.A., et al., U.S. Patent No. 5,156,949, issued October
20, 1992; Luciw, P.A., et al., U.S. Patent No. 5,688,688,
November 18, 1997). Gag sequences obtained from other HIV
variants may be manipulated in similar fashion following
the teachings of the present specification. Such other
variants include, but are not limited to, Gag protein
encoding sequences obtained from the isolates HIVIIIb,
HIVSF2, HIV-1SF162, HIV-1SF170, HIVLAV, HIVLAI, HIVMN,
HIV-1CM235, HIV-1US4, other HIV-1 strains from diverse
subtypes (e.g., subtypes A through G, and O), HIV-2
strains and diverse subtypes (e.g., HIV-2UC1 and
HIV-2UC2), and simian immunodeficiency virus (SIV). (See,
e.g., Virology, 3rd Edition (W.K. Joklik ed. 1988);
Fundamental Virology, 2nd Edition (B.N. Fields and D.M.
Knipe, eds. 1991); Virology, 3rd Edition (Fields, BN, DM
Knipe, PM Howley, Editors, 1996, Lippincott-Raven,
Philadelphia, PA), for a description of these and other
related viruses.)

First, the HIV-1 codon usage pattern was modified so 
that the resulting nucleic acid coding sequence was 
comparable to codon usage found in highly expressed human 
genes (Example 1). The HIV codon usage reflects a high
content of the nucleotides A or T in the codon triplet.
The effect of the HIV-1 codon usage is a high AT content
in the DNA sequence that results in decreased
translation ability and instability of the mRNA. In
comparison, highly expressed human codons prefer the
nucleotides G or C. The Gag coding sequences were
modified to be comparable to codon usage found in highly
expressed human genes. In Figure 11 (Example 1), the
percent A-T content of cDNA sequences corresponding to
the mRNA for a known unstable mRNA and a known stable
mRNA are compared to the percent A-T content of native
HIV-1SF2 Gag cDNA and to the synthetic Gag cDNA sequence
of the present invention. Experiments performed in
support of the present invention showed that the
synthetic Gag sequences were capable of higher levels of
protein production (see the Examples) relative to the
native Gag sequences. The data in Figure 11 suggest that
one reason for this increased production is increased
stability of the mRNA corresponding to the synthetic Gag
coding sequences versus the mRNA corresponding to the
native Gag coding sequences.
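
By way of illustration only, the relationship between codon choice and percent A-T content described above can be sketched in Python. This is not the procedure of Example 1. The preferred-codon table (one G/C-rich synonymous codon per amino acid), the function names and the toy sequence are all assumptions introduced here for illustration; a real design would rely on measured codon frequencies from highly expressed human genes.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: an illustrative choice of one G/C-rich synonymous codon per
# residue. It stands in for a real codon-usage table and is not taken from
# the specification.
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def codons(seq):
    """Split an in-frame coding sequence into upper-case codons."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def at_content(seq):
    """Percent of positions that are A or T."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def recode(seq):
    """Replace every codon by the assumed preferred synonymous codon."""
    return "".join(PREFERRED[CODON_TABLE[c]] for c in codons(seq))


# Hypothetical A/T-rich toy open reading frame (not a sequence from the
# specification).
toy = "ATGGGTGCAAGAGCATCAGTATTAAGT"
recoded = recode(toy)
print(translate(toy) == translate(recoded))  # protein is unchanged
print(round(at_content(toy), 1), round(at_content(recoded), 1))
```

Because every substitution is synonymous, the encoded polypeptide is unchanged while the percent A-T content of the coding sequence falls, which is the kind of comparison summarized in Figure 11.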

Second, there are inhibitory (or instability) 
elements (INS) located within the Gag coding sequences
(Example 1). The RRE is a secondary
RNA structure that interacts with the HIV-encoded Rev
protein to overcome the expression down-regulating
effects of the INS. To overcome the post-transcriptional
activating mechanisms of RRE and Rev, the instability
elements were inactivated by introducing multiple point
mutations that did not alter the reading frame of the
encoded proteins. Figure 1 shows the original SF2 Gag
sequence, the location of the INS sequences, and the
modifications made to the INS sequences to reduce their
effects. The resulting modified coding sequences are
presented as a synthetic Gag expression cassette (SEQ ID
NO:4).
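
As a minimal sketch of the constraint just described, the following Python fragment checks that a set of point mutations leaves both the reading frame and the encoded polypeptide unchanged. The helper names and the two short sequences are hypothetical and serve only as an illustration; the actual INS locations and modifications are those shown in Figure 1.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate an in-frame coding sequence into one-letter amino acids."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def point_mutations(original, modified):
    """List (position, old base, new base) for every substituted position.

    Equal lengths are required, so insertions or deletions (which would
    shift the reading frame) are rejected outright.
    """
    if len(original) != len(modified):
        raise ValueError("substitutions only: lengths must be equal")
    pairs = zip(original.upper(), modified.upper())
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]


def is_silent(original, modified):
    """True when the modified sequence encodes the same polypeptide."""
    return translate(original) == translate(modified)


# Hypothetical A/T-rich stretch and a hypothetical modified version of it.
wild_type = "AAAGAAATTAGTAAA"
modified = "AAGGAGATCAGCAAG"

changes = point_mutations(wild_type, modified)
print(len(changes), is_silent(wild_type, modified))
```

In this toy case every substitution falls at a third codon position, so the A/T-rich stretch is disrupted without changing any encoded residue.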

Modification of the Gag polypeptide coding sequences 
resulted in improved expression relative to the wild-type 
coding sequences in a number of mammalian cell lines (as
well as other types of cell lines, including, but not
limited to, insect cells). Further, expression of the
sequences resulted in production of virus-like particles
(VLPs) by these cell lines (see below). Similar Gag
polypeptide coding sequences can be obtained from a
variety of isolates (families, sub-types, strains, etc.)
including, but not limited to, the isolates HIVIIIb,
HIVSF2, HIV-1SF162, HIV-1SF170, HIVLAV, HIVLAI, HIVMN,
HIV-1CM235, HIV-1US4, other HIV-1 strains from diverse
subtypes (e.g., subtypes A through G, and O), HIV-2
strains and diverse subtypes (e.g., HIV-2UC1 and
HIV-2UC2), and simian immunodeficiency virus (SIV). (See,
e.g., Virology, 3rd Edition (W.K. Joklik ed. 1988);
Fundamental Virology, 2nd Edition (B.N. Fields and D.M.
Knipe, eds. 1991); Virology, 3rd Edition (Fields, BN, DM
Knipe, PM Howley, Editors, 1996, Lippincott-Raven,
Philadelphia, PA).) Gag polypeptide
encoding sequences derived from these variants can be
optimized and tested for improved expression in mammals
by following the teachings of the present specification
(see the Examples, in particular Example 1).



2.1.2 Further Modification of Sequences Including HIV-1 Gag Nucleic Acid Coding Sequences

Experiments performed in support of the present 
invention have shown that similar modifications of HIV-1
Gag-protease, Gag-reverse transcriptase and Gag-
polymerase sequences also result in improved expression
of the polyproteins, as well as the production of VLPs
formed by polypeptides produced from such modified coding
sequences.

For the Gag-protease sequence (wild type, SEQ ID 
NO:2; modified, SEQ ID NOs:5, 78, 79), the changes in
codon usage were restricted to the regions upstream of
the -1 frameshift (Figure 2). Further, inhibitory (or
instability) elements (INS) located within the
Gag-protease polypeptide coding sequence
were altered as well (indicated in Figure 2). Exemplary
constructs (which include the -1 frameshift) encoding
modified Gag-protease sequences include those shown in
SEQ ID NOs:78 and 79 (Figures 69 and 70). These are: GP1
(SEQ ID NO:78), in which the protease region was also
codon optimized and INS inactivated, and GP2 (SEQ ID
NO:79), in which the protease region was only subjected
to INS inactivation.

For other Gag-containing sequences, for example the 
Gag-polymerase sequence (wild type, SEQ ID NO:3;
modified, SEQ ID NO:6) or Gag-reverse transcriptase (wild
type, SEQ ID NO:77; modified, SEQ ID NOs:80-84), the
changes in codon usage are similar to those for the Gag-
protease sequence. Those expression cassettes which
contain a frameshift in the GagPol coding sequence are
designated "FS(+)" (SEQ ID NOs:80 and 81, Figures 71 and
72) while the designation "FS(-)" (SEQ ID NOs:82, 83 and
84, Figures 73, 74 and 75) indicates that there is no
frameshift utilized in this coding sequence.

In addition to polyproteins containing HIV-related 
sequences, the various Gag-, Gag-prot, Gag-pol, Gag-
reverse transcriptase encoding sequences of the present
invention can be fused to other polypeptides (creating
chimeric polypeptides) for which an immunogenic response
is desired. An example of such a chimeric protein is the
joining of the improved expression Gag encoding sequences
to the Hepatitis C Virus (HCV) core protein. In this
case, the HCV-core encoding sequences were placed in-
frame with the HIV-Gag encoding sequences, resulting in
the Gag/HCV-core encoding sequence presented as SEQ ID
NO:7 (wild type sequence presented as SEQ ID NO:8).

Further sequences useful in the practice of the 
present invention include, but are not limited to,
sequences encoding viral epitopes/antigens {including but
not limited to, HCV antigens (e.g., E1, E2; Houghton,
M., et al., U.S. Patent No. 5,714,596, issued February
3, 1998; Houghton, M., et al., U.S. Patent No.
5,712,088, issued January 27, 1998; Houghton, M., et
al., U.S. Patent No. 5,683,864, issued November 4, 1997;
Weiner, A.J., et al., U.S. Patent No. [illegible], issued
March 17, 1998; Weiner, A.J., et al., U.S. Patent No.
5,766,845, issued June 16, 1998; Weiner, A.J., et al.,
U.S. Patent No. 5,670,152, issued September 23, 1997),
HIV antigens (e.g., derived from nef, tat, rev, vpu, vif,
vpr and/or env)}; and sequences encoding tumor
antigens/epitopes. Additional sequences are described
below. Variations on the orientation of the Gag
and other coding sequences, relative to each other, are
also described below.

Gag, Gag-protease, Gag-reverse transcriptase and/or
Gag-polymerase polypeptide coding sequences can be
obtained from any HIV isolates (different families,
subtypes, and strains) including but not limited to the
isolates HIVIIIb, HIVSF2, HIVSF162, HIVUS4, HIVCM235,
HIVLAV, HIVLAI, HIVMN (see, e.g., Myers et al., Los
Alamos Database, Los Alamos National Laboratory, Los
Alamos, New Mexico (1992); Myers et al., Human
Retroviruses and Aids, 1997, Los Alamos, New Mexico: Los
Alamos National Laboratory). Synthetic expression
cassettes can be generated using
such coding sequences as starting material by following
the teachings of the present specification (e.g., see
Example 1). Further, the synthetic expression cassettes
of the present invention include related Gag polypeptide
coding sequences having greater than 75%, preferably
greater than 80-85%, more preferably greater than 90-95%,
and most preferably greater than 98% sequence identity
(or any integer value within these ranges) to the
synthetic expression cassette sequences disclosed herein
(for example, SEQ ID NO:4; SEQ ID NO:5; SEQ ID NO:6; and
SEQ ID NO:20, the Gag Major Homology Region).
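
The identity thresholds recited above can be illustrated with a short Python sketch. It assumes, purely for illustration, that the two sequences have already been aligned to equal length (with "-" marking gaps) and that identity is counted as matching columns over aligned columns; this passage does not prescribe an alignment method, and the function names and toy sequences are hypothetical.

```python
# Thresholds recited in the passage above, highest first.
THRESHOLDS = (98, 95, 90, 85, 80, 75)


def percent_identity(a, b):
    """Percent identity between two pre-aligned sequences of equal length.

    ASSUMPTION: columns in which both sequences carry a gap are ignored,
    and a gap opposite a base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_exceeded(identity):
    """Return the largest recited threshold that `identity` is greater than."""
    for threshold in THRESHOLDS:
        if identity > threshold:
            return threshold
    return None


# Hypothetical 20-nucleotide reference and a variant with one substitution.
reference = "ATGGGCGCCCGCGCCAGCGT"
variant = "ATGGGCGCCCGCGCCAGCGC"

identity = percent_identity(reference, variant)
print(identity, highest_threshold_exceeded(identity))
```

The comparison is strict ("greater than"), so a sequence at exactly 95% identity satisfies the greater-than-90% tier but not the greater-than-95% tier.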

2.1.3 Expression of Synthetic Sequences Encoding HIV-1 Gag and Related Polypeptides

Several synthetic Gag-encoding sequences (expression
cassettes) of the present invention were cloned into a
number of different expression vectors (Example 1) to
evaluate levels of expression and production of VLPs.
Two modified synthetic coding sequences are presented as
a synthetic Gag expression cassette (SEQ ID NO:4) and a
synthetic Gag-protease expression cassette (SEQ ID NOs:78
and 79). Other synthetic Gag-encoding sequences are
presented, for example, as SEQ ID NOs:80 through 84. The
synthetic DNA fragments for Gag-encoding polypeptides
(e.g., Gag, Gag-protease, Gag-polymerase, Gag-reverse
transcriptase) were cloned into expression vectors
described in Example 1, including a transient expression
vector, CMV-promoter-based mammalian vectors, and a
shuttle vector for use in baculovirus expression systems.
Corresponding wild-type sequences were cloned into the
same vectors.

These vectors were then transfected into several
different cell types, including a variety of mammalian
cell lines (293, RD, COS-7, and CHO; cell lines
available, for example, from the A.T.C.C.). The cell
lines were cultured under appropriate conditions and the
levels of p24 (Gag) expression in supernatants were
evaluated (Example 2). The results of these assays
demonstrated that expression of synthetic Gag-encoding
sequences was significantly higher than that of
corresponding wild-type sequences (Example 2; Table 2).

Further, Western Blot analysis showed that cells 
containing the synthetic Gag expression cassette produced
the expected 55 kD (p55) protein at higher per-cell
concentrations than cells containing the native
expression cassette. The Gag p55 protein was seen in
both cell lysates and supernatants. The levels of
production were significantly higher in cell supernatants
for cells transfected with the synthetic Gag expression
cassette of the present invention. Experiments performed
in support of the present invention suggest that cells
containing the synthetic Gag-prot expression cassettes
produced the expected Gag-prot protein at comparably
higher per-cell concentrations than cells containing the
wild-type expression cassette.

Fractionation of the supernatants from mammalian 
cells transfected with the synthetic Gag expression
cassette showed that it provides superior production of
both p55 protein and VLPs, relative to the wild-type Gag
sequences (Examples 6 and 7).

Efficient expression of these Gag-containing 
polypeptides in mammalian cell lines provides the 
following benefits: the Gag polypeptides are free of
baculovirus contaminants; production by established
methods approved by the FDA; increased purity; greater
yields (relative to native coding sequences); and a novel
method of producing the Gag-containing polypeptides in
CHO or other mammalian cells which is not feasible in the
absence of the increased expression obtained using the
constructs of the present invention. Exemplary mammalian
cell lines include, but are not limited to, BHK, VERO,
HT1080, 293, 293T, RD, COS-7, CHO, Jurkat, HUT, SUPT,
C8166, MOLT4/clone8, MT-2, MT-4, H9, PM1, CEM, myeloma
cells (e.g., SB20 cells) and CEMX174 (such cell lines are
available, for example, from the A.T.C.C.).

A synthetic Gag expression cassette of the present 
invention also demonstrated high levels of expression and
VLP production when transfected into insect cells
(Example 7). Further, in addition to a higher total
protein yield, the final product from the synthetic p55-
expressed Gag consistently contained lower amounts of
contaminating baculovirus proteins than the final
purified product from the native p55-expressed Gag. 

Further, synthetic Gag expression cassettes of the 
present invention have also been introduced into yeast 
vectors which were transformed into and efficiently
expressed by yeast cells (Saccharomyces cerevisiae; using
vectors as described in Rosenberg, S. and Tekamp-Olson,
P., U.S. Patent No. RE35,749, issued March 17, 1998).

In addition to the mammalian and insect vectors 
described in the Examples, the synthetic expression
cassettes of the present invention can be incorporated
into a variety of expression vectors using selected
expression control elements. Appropriate vectors and
control elements for any given cell type can be selected
by one having ordinary skill in the art in view of the
teachings of the present specification and information
known in the art about expression vectors.

For example, a synthetic Gag expression cassette can 
be inserted into a vector which includes control elements 
operably linked to the desired coding sequence, which
allow for the expression of the gene in a selected cell
type. For example, typical promoters for mammalian cell
expression include the SV40 early promoter, a CMV
promoter such as the CMV immediate early promoter (a CMV
promoter can include intron A), RSV, HIV-LTR, the mouse
mammary tumor virus LTR promoter (MMLV-LTR), FIV-LTR, the
adenovirus major late promoter (Ad MLP), and the herpes
simplex virus promoter, among others. Other, nonviral
promoters, such as a promoter derived from the murine
metallothionein gene, will also find use for mammalian
expression. Typically, transcription termination and
polyadenylation sequences will also be present, located
3' to the translation stop codon. Preferably, a sequence
for optimization of initiation of translation, located 5'
to the coding sequence, is also present. Examples of
transcription terminator/polyadenylation signals include
those derived from SV40, as described in Sambrook, et
al., supra, as well as a bovine growth hormone terminator
sequence. Introns, containing splice donor and acceptor
sites, may also be designed into the constructs for use
with the present invention (Chapman et al., Nuc. Acids
Res. (1991) 19:3979-3986).

Enhancer elements may also be used herein to 
increase expression levels of the mammalian constructs. 
Examples include the SV40 early gene enhancer, as 
described in Dijkema et al., EMBO J. (1985) 4:761, the 
enhancer/promoter derived from the long terminal repeat 
(LTR) of the Rous Sarcoma Virus, as described in Gorman 
et al., Proc. Natl. Acad. Sci. USA (1982b) 79:6777 and
elements derived from human CMV, as described in Boshart
et al., Cell (1985) 41:521, such as elements included in
the CMV intron A sequence (Chapman et al., Nuc. Acids
Res. (1991) 19:3979-3986).

The desired synthetic Gag polypeptide encoding
sequences can be cloned into any number of commercially
available vectors to generate expression of the
polypeptide in an appropriate host system. These systems
include, but are not limited to, the following:
baculovirus expression {Reilly, P.R., et al., Baculovirus
Expression Vectors: A Laboratory Manual (1992); Beames, et al.,
Biotechniques 11:378 (1991); Pharmingen; Clontech, Palo
Alto, CA}, vaccinia expression {Earl, P.L., et al.,
"Expression of proteins in mammalian cells using
vaccinia" In Current Protocols in Molecular Biology (F.
M. Ausubel, et al. Eds.), Greene Publishing Associates &
Wiley Interscience, New York (1991); Moss, B., et al.,
U.S. Patent Number 5,135,855, issued 4 August 1992},
expression in bacteria {Ausubel, F.M., et al., Current
Protocols in Molecular Biology, John Wiley and Sons, Inc.,
Media PA; Clontech}, expression in yeast {Rosenberg, S.
and Tekamp-Olson, P., U.S. Patent No. RE35,749, issued
March 17, 1998; Shuster, J.R., U.S. Patent No. 5,629,203,
issued May 13, 1997; Gellissen, G., et al., Antonie Van
Leeuwenhoek, 62(1-2):79-93 (1992); Romanos, M.A., et al.,
Yeast 8(6):423-488 (1992); Goeddel, D.V., Methods in
Enzymology 185 (1990); Guthrie, C., and G.R. Fink,
Methods in Enzymology 194 (1991)}, expression in
mammalian cells {Clontech; Gibco-BRL, Grand Island, NY;
e.g., Chinese hamster ovary (CHO) cell lines (Haynes, J.,
et al., Nuc. Acid. Res. 11:687-706 (1983); Lau,
Y.F., et al., Mol. Cell. Biol. 4:1469-1475 (1984);
Kaufman, R.J., "Selection and coamplification of
heterologous genes in mammalian cells," in Methods in
Enzymology, vol. 185, pp. 537-566. Academic Press, Inc.,
San Diego CA (1991))}, and expression in plant cells
{plant cloning vectors, Clontech Laboratories, Inc., Palo
Alto, CA, and Pharmacia LKB Biotechnology, Inc.,
Piscataway, NJ; Hood, E., et al., J. Bacteriol.
168:1291-1301 (1986); Nagel, R., et al., FEMS Microbiol.
Lett. 67:325 (1990); An, et al., "Binary Vectors", and
others in Plant Molecular Biology Manual A3:1-19 (1988);
Miki, B.L.A., et al., pp. 249-265, and others in Plant DNA
Infectious Agents (Hohn, T., et al., eds.) Springer-
Verlag, Wien, Austria, (1987); Plant Molecular Biology:
Essential Techniques, P.G. Jones and J.M. Sutton, New
York, J. Wiley, 1997; Miglani, Gurbachan, Dictionary of
Plant Genetics and Molecular Biology, New York, Food
Products Press, 1998; Henry, R.J., Practical
Applications of Plant Molecular Biology, New York,
Chapman & Hall, 1997}.

Also included in the invention is an expression 
vector, such as the CMV promoter-containing vectors 
described in Example 1, containing coding sequences and 
expression control elements which allow expression of the 
coding regions in a suitable host. The control elements
generally include a promoter, translation initiation
codon, and translation and transcription termination
sequences, and an insertion site for introducing the
insert into the vector. Translational control elements
have been reviewed by M. Kozak (e.g., Kozak, M., Mamm
Genome 7(8):563-574, 1996; Kozak, M., Biochimie
76(9):815-821, 1994; Kozak, M., J Cell Biol
108(2):229-241, 1989; Kozak, M., and Shatkin, A.J.,
Methods Enzymol 60:360-375, 1979).

Expression in yeast systems has the advantage of 
commercial production. Recombinant protein production by
vaccinia and CHO cell lines has the advantage of using
mammalian expression systems. Further, vaccinia virus
expression has several advantages including the
following: (i) its wide host range; (ii) faithful
post-transcriptional modification, processing, folding,
transport, secretion, and assembly of recombinant
proteins; (iii) high level expression of relatively
soluble recombinant proteins; and (iv) a large capacity
to accommodate foreign DNA.

The recombinantly expressed polypeptides from 
synthetic Gag-encoding expression cassettes are typically 
isolated from lysed cells or culture media. Purification 
can be carried out by methods known in the art including
salt fractionation, ion exchange chromatography, gel
filtration, size-exclusion chromatography, size-
fractionation, and affinity chromatography.
Immunoaffinity chromatography can be employed using
antibodies generated based on, for example, Gag antigens. 

Advantages of expressing the Gag-containing proteins
of the present invention using mammalian cells include,
but are not limited to, the following: well-established
protocols for scale-up production; the ability to produce
VLPs; cell lines suitable to meet good manufacturing
practice (GMP) standards; and culture conditions for
mammalian cells that are known in the art.



2.1.4 Modification of HIV-1 Env Nucleic Acid Coding Sequences

One aspect of the present invention is the
generation of HIV-1 Env protein coding sequences, and
related sequences, having improved expression relative to
the corresponding wild-type sequence. Exemplary
embodiments of the present invention are illustrated
herein by modifying the Env protein wild-type sequences
obtained from the HIV-1 subtype B strains HIV-1US4 and
HIV-1SF162 (Myers et al., Los Alamos Database, Los Alamos
National Laboratory, Los Alamos, New Mexico (1992); Myers
et al., Human Retroviruses and Aids, 1991, Los Alamos,
New Mexico: Los Alamos National Laboratory). Env sequences
obtained from other HIV variants may be manipulated in
similar fashion following the teachings of the present
specification. Such other variants include those
described above in Section 2.1.1 and on the World Wide
Web (Internet), for example at
http://hiv-web.lanl.gov/cgi-bin/hivDB3/public/wdb/ssampublic and
http://hiv-web.lanl.gov.

First, the HIV-1 codon usage pattern was modified so 
that the resulting nucleic acid coding sequence was 
comparable to codon usage found in highly expressed human 
genes (Example 1). The HIV codon usage reflects a high 
content of the nucleotides A or T of the codon- triplet . 
The effect of the HIV-1 codon usage is a high AT content 
in the DNA sequence that results in a decreased 
translation ability and instability of the mRNA. In 
comparison, highly expressed human codons prefer the 
nucleotides G or C. The Env coding sequences were 
modified to be comparable to codon usage found in highly 
expressed human genes. Experiments performed in support 
of the present invention showed that the synthetic Env 
sequences were capable of higher level of protein 
production (see the Examples) relative to the native Env 
sequences. One reason for this increased production may
be increased stability of the mRNA corresponding to the
synthetic Env coding sequences versus the mRNA
corresponding to the native Env coding sequences.
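
The codon replacement described above can be illustrated
with a minimal sketch. It is not the procedure of Example 1;
it only assumes that each wild-type codon is swapped for a
single synonymous codon commonly reported as frequent in
highly expressed human genes. The table
PREFERRED_HUMAN_CODON, the function names and the short
input sequence are hypothetical and chosen for illustration.

```python
from itertools import product

# Standard genetic code, built in TCAG order (T, C, A, G at each position).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
GENETIC_CODE = {"".join(codon): aa
                for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# ASSUMPTION: one G/C-rich synonymous codon per amino acid, picked for
# illustration only; it is not the codon table used in the specification.
PREFERRED_HUMAN_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def translate(dna):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def humanize_codons(dna):
    """Replace every codon with the assumed preferred synonymous codon."""
    return "".join(PREFERRED_HUMAN_CODON[aa] for aa in translate(dna))


def gc_content(dna):
    """Fraction of G and C nucleotides in the sequence."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


if __name__ == "__main__":
    wild_type = "ATGAAAGTAAAAGGAATA"  # hypothetical A/T-rich fragment
    synthetic = humanize_codons(wild_type)
    # The encoded polypeptide is unchanged; only the nucleotide content shifts.
    print(translate(wild_type) == translate(synthetic))
    print(gc_content(wild_type), gc_content(synthetic))
```

In this sketch the encoded amino acid sequence is preserved
while the G/C fraction of the coding sequence rises, which
is the property the paragraph above associates with improved
mRNA stability and translation.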

Modification of the Env polypeptide coding sequences 
resulted in improved expression relative to the wild-type 
coding sequences in a number of mammalian cell lines. 
Similar Env polypeptide coding sequences can be obtained 
from a variety of isolates (families, sub-types, etc.). 
Env polypeptide encoding sequences derived from these
variants can be optimized and tested for improved
expression in mammals by following the teachings of the
present specification (see the Examples, in particular
Example 2).

2.1.5 Further Modification of HIV-1 Env Nucleic Acid Coding Sequences

In addition to proteins containing HIV-related 
sequences, the Env encoding sequences of the present 
invention can be fused to other polypeptides (creating 
chimeric polypeptides) . Also, variations on the 
orientation of the Env and other coding sequences, 
relative to each other, are contemplated. Further, the 
HIV protein encoding cassettes of the present invention 
can be co-expressed using one vector or multiple vectors.
In addition, the polyproteins can be operably linked to
the same or different promoters. 

Env polypeptide coding sequences can be obtained
from any HIV isolates (different families, subtypes, and
strains) including but not limited to the isolates HIVIIIb,
HIVSF2, HIVUS4, HIVCM235, HIVSF162, HIVLAV, HIVLAI, HIVMN (see,
e.g., Myers et al., Los Alamos Database, Los Alamos
National Laboratory, Los Alamos, New Mexico (1992); Myers
et al., Human Retroviruses and Aids, 1997, Los Alamos,
New Mexico: Los Alamos National Laboratory). Synthetic
expression cassettes can be generated using such coding
sequences as starting material by following the teachings
of the present specification (e.g., see Example 1).
Further, the synthetic expression cassettes (and purified
polynucleotides) of the present invention include related
Env polypeptide coding sequences having greater than 90%,
preferably greater than 92%, more preferably greater than
95%, and most preferably greater than 98% sequence
identity to the synthetic expression cassette sequences
disclosed herein (for example, SEQ ID NOs: 71-72; and/or
the sequences presented in Tables 1A and 1B) when the
sequences of the present invention are used as the query
sequence.
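
As a rough illustration of the identity thresholds recited
above, the following sketch scores a candidate sequence
against a query. It assumes the two sequences have already
been aligned by some external method (gaps written as "-")
and that identity is counted over query positions only; the
function names and this counting convention are assumptions
made for the example, not the method of the specification.

```python
# Identity tiers recited in the text, from most to least preferred.
THRESHOLDS = (98.0, 95.0, 92.0, 90.0)


def percent_identity(query_aln, subject_aln):
    """Percent identity of two pre-aligned sequences, scored over query positions.

    ASSUMPTION: both strings come from the same alignment, so they have equal
    length, and "-" marks a gap. Columns where the query has a gap are skipped.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for q, s in zip(query_aln.upper(), subject_aln.upper()):
        if q == "-":
            continue  # position absent from the query sequence
        compared += 1
        if q == s:
            matches += 1
    if compared == 0:
        raise ValueError("query contains no residues")
    return 100.0 * matches / compared


def highest_threshold_met(identity):
    """Return the most stringent tier strictly exceeded, or None if none is."""
    for threshold in THRESHOLDS:
        if identity > threshold:
            return threshold
    return None


if __name__ == "__main__":
    query = "ATGC" * 25                      # hypothetical 100-nt query
    subject = list(query)
    subject[0], subject[10], subject[20] = "T", "A", "T"  # three mismatches
    identity = percent_identity(query, "".join(subject))
    print(identity, highest_threshold_met(identity))
```

The comparison is strict ("greater than"), so a sequence at
exactly 90% identity does not meet the lowest tier in this
sketch.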

2.1.6 Expression of Synthetic Sequences Encoding HIV-1 Env and Related Polypeptides

Several synthetic Env-encoding sequences (expression 
cassettes) of the present invention were cloned into a 
number of different expression vectors (Example 1) to 
evaluate levels of expression and production of Env 
polypeptide. Modified synthetic coding sequences are
presented as synthetic Env expression cassettes (Example
1, e.g., Tables 1A and 1B). The synthetic DNA fragments
for Env were cloned into eucaryotic expression vectors
described in Example 1 and in Section 2.1.3 above,
including a transient expression vector and
CMV-promoter-based mammalian vectors. Corresponding
wild-type sequences were cloned into the same vectors.

These vectors were then transfected into several
different cell types, including a variety of mammalian
cell lines (293, RD, COS-7, and CHO cell lines,
available, for example, from the A.T.C.C.). The cell
lines were cultured under appropriate conditions and the
levels of gp120, gp140 and gp160 Env expression in
supernatants were evaluated (Example 2). Env
polypeptides include, but are not limited to, for
example, native gp160, oligomeric gp140, monomeric gp120
as well as modified sequences of these polypeptides. The
results of these assays demonstrated that expression of
synthetic Env encoding sequences was significantly
higher than that of corresponding wild-type sequences
(Example 2; Tables 3 and 4).

Further, Western Blot analysis showed that cells
containing the synthetic Env expression cassette produced
the expected protein (gp120, gp140 or gp160) at higher
per-cell concentrations than cells containing the native
expression cassette. The Env proteins were seen in both
cell lysates and supernatants. The levels of production
were significantly higher in cell supernatants for cells
transfected with the synthetic Env expression cassettes
of the present invention as compared to wild type.

Fractionation of the supernatants from mammalian
cells transfected with the synthetic Env expression
cassettes showed that the cassettes provide superior
production of Env proteins, relative to the wild-type Env
sequences (Examples 2 and 3).

Efficient expression of these Env-containing
polypeptides in mammalian cell lines provides the
following benefits: the Env polypeptides are free of
baculovirus or other viral contaminants; production by
established methods approved by the FDA; increased
purity; greater yields (relative to native coding
sequences); and a novel method of producing the
Env-containing polypeptides in CHO cells which is less
feasible in the absence of the increased expression
obtained using the constructs of the present invention.

Exemplary cell lines (e.g., mammalian, yeast,
insect, etc.) include those described above in Section
2.1.3 for Gag-containing constructs. Further, appropriate
vectors and control elements (e.g., promoters, enhancers,
polyadenylation sequences, etc.) for any given cell type
can be selected, as described above in Section 2.1.3, by
one having ordinary skill in the art in view of the
teachings of the present specification and information
known in the art about expression vectors. In addition,
the recombinantly expressed polypeptides from synthetic
Env-encoding expression cassettes are typically isolated
and purified from lysed cells or culture media, as
described above for Gag-encoding expression cassettes.
An exemplary purification is described in Example 4 and
shown in Figure 60.

2.1.7 Modification of HIV-1 Tat Nucleic Acid Coding Sequences

Another aspect of the present invention is the
generation of HIV-1 tat protein coding sequences, and
related sequences, having improved expression relative to
the corresponding wild-type sequence. Exemplary
embodiments of the present invention are illustrated
herein modifying the tat wild-type nucleotide sequence
(SEQ ID NO: 85, Figure 76) obtained from SF162 as
described above. Exemplary synthetic tat constructs are
shown in SEQ ID NO: 87, which depicts a tat construct
encoding a full-length tat polypeptide from strain SF162;
SEQ ID NO: 88, which depicts a tat construct encoding a
tat polypeptide having the cysteine residue at position 22
changed; and SEQ ID NO: 89, which depicts a tat construct
encoding the amino terminal portion of a tat polypeptide
from strain SF162. The amino portion of the tat protein
appears to contain many of the epitopes that induce an
immune response. In addition, further modifications
include replacement or deletion of the cysteine residue at
position 22, for example with a valine residue, an
alanine residue or a glycine residue (SEQ ID NOs: 88 and
89, Figures 79 and 81), see, e.g., Caputo et al. (1996)
Gene Ther. 3:235. In Figure 81, which depicts a tat
construct encoding the amino terminal portion of a tat
polypeptide, the nucleotides (nucleotides 64-66) encoding
the cysteine residue are underlined. The design and
construction of suitable constructs can be readily done
using the teachings of the present specification. As with
Gag, pol, prot and Env, tat polypeptide coding sequences
can be obtained from a variety of isolates (families,
sub-types, etc.).

Modification of the tat polypeptide coding sequences
results in improved expression relative to the wild-type
coding sequences in a number of cell lines (e.g., 
mammalian, yeast, bacterial and insect cells). Tat 
polypeptide encoding sequences derived from these 
variants can be optimized and tested for improved 
expression in mammals by following the teachings of the 
present specification (see the Examples, in particular 
Example 2) . 

Various forms of the different embodiments of the 
invention, described herein, may be combined. For 
example, polynucleotides may be derived from the 
polynucleotide sequences of the present invention, 
including, but not limited to, coding sequences for Gag 
polypeptides, Env polypeptides, polymerase polypeptides, 
protease polypeptides, tat polypeptides, and reverse 
transcriptase polypeptides. Further, the polynucleotide 
coding sequences of the present invention may be combined 
into multi-cistronic expression cassettes where typically
each coding sequence for each polypeptide is preceded by
IRES sequences.
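
The multi-cistronic arrangement just described can be
pictured as an ordered list of elements. The sketch below is
only a schematic layout under stated assumptions: the
Element class, the build_multicistronic function, and the
element names (a single upstream promoter and a trailing
polyadenylation signal) are hypothetical and do not
reproduce any construct of the Examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str  # "promoter", "IRES", "CDS" or "polyA"
    name: str


def build_multicistronic(promoter, coding_names, polya="polyA"):
    """Lay out one cassette in which every coding sequence follows an IRES.

    ASSUMPTION: a single promoter drives the whole transcript and a single
    polyadenylation signal terminates it; both are illustrative choices.
    """
    if not coding_names:
        raise ValueError("at least one coding sequence is required")
    elements = [Element("promoter", promoter)]
    for name in coding_names:
        elements.append(Element("IRES", "IRES-" + name))  # IRES precedes each CDS
        elements.append(Element("CDS", name))
    elements.append(Element("polyA", polya))
    return elements


if __name__ == "__main__":
    cassette = build_multicistronic("CMV", ["gag", "env", "tat"])
    print(" | ".join(e.kind + ":" + e.name for e in cassette))
```

Keeping the layout as data makes it straightforward to check
that each coding sequence is immediately preceded by an IRES
before any sequence-level assembly is attempted.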

2.2 Production of Virus-like Particles and Use of the Constructs of the Present Invention to Create Packaging Cell Lines

The group-specific antigens (Gag) of human
immunodeficiency virus type-1 (HIV-1) self-assemble into
noninfectious virus-like particles (VLP) that are
released from various eucaryotic cells by budding
(reviewed by Freed, E.O., Virology 251:1-15, 1998). The
synthetic expression cassettes of the present invention
provide efficient means for the production of HIV-Gag
virus-like particles (VLPs) using a variety of different
cell types, including, but not limited to, mammalian
cells.

Viral particles can be used as a matrix for the
proper presentation of an antigen entrapped or associated
therewith to the immune system of the host. For example,
U.S. Patent No. 4,722,840 describes hybrid particles
comprised of a particle-forming fragment of a structural
protein from a virus, such as a particle-forming fragment
of hepatitis B virus (HBV) surface antigen (HBsAg), fused
to a heterologous polypeptide. Tindle et al., Virology
(1994) 200:547-557, describes the production and use of
chimeric HBV core antigen particles containing epitopes
of human papillomavirus (HPV) type 16 E7 transforming
protein.

Adams et al., Nature (1987) 329:68-70, describes the
recombinant production of hybrid HIV gp120:Ty VLPs in
yeast and Brown et al., Virology (1994) 198:477-488, the
production of chimeric proteins consisting of the VP2
protein of human parvovirus B19 and epitopes from human
herpes simplex virus type 1, as well as mouse hepatitis
virus A59. Wagner et al. (Virology (1994) 200:162-175;
Brand et al., J. Virol. Meth. (1995) 51:153-168; Virology
(1996) 220:128-140) and Wolf et al. (EP 0 449 116 A1,
published 2 October 1991; WO 96/30523, published 3
October 1996) describe the assembly of chimeric HIV-1
p55Gag particles. U.S. Patent No. 5,503,833 describes
the use of rotavirus VP6 spheres for encapsulating and
delivering therapeutic agents.






WO 00/39302 

PCT/US99/31245 

2.2.1 VLP Production Using the Synthetic Expression Cassettes of the Present Invention

Experiments performed in support of the present 
invention have demonstrated that the synthetic expression 
cassettes of the present invention provide superior 
production of both protein and VLPs, relative to native
coding sequences (Examples 7 and 15). Further, electron
microscopic evaluation of VLP production (Examples 6 and
15, Figures 3A-B and 65A-F) showed that free and budding
immature virus particles of the expected size were 
produced by cells containing the synthetic expression 
cassettes. 

Using the synthetic expression cassettes of the
present invention, rather than native coding sequences,
for the production of virus-like particles provides
several advantages. First, VLPs can be produced in
enhanced quantity making isolation and purification of
the VLPs easier. Second, VLPs can be produced in a
variety of cell types using the synthetic expression
cassettes; in particular, mammalian cell lines can be
used for VLP production, for example, CHO cells.
Production using CHO cells provides (i) VLP formation;
(ii) correct myristylation and budding; (iii) absence of
non-mammalian cell contaminants (e.g., insect viruses
and/or cells); and (iv) ease of purification. The
synthetic expression cassettes of the present invention
are also useful for enhanced expression in cell-types
other than mammalian cell lines. For example, infection
of insect cells with baculovirus vectors encoding the
synthetic expression cassettes resulted in higher levels
of total protein yield and higher levels of VLP
production (relative to wild-type coding sequences).
Further, the final product from insect cells infected
with the baculovirus-Gag synthetic expression cassettes
consistently contained lower amounts of contaminating
insect proteins than the final product when wild-type
coding sequences were used (Examples).

VLPs can spontaneously form when the
particle-forming polypeptide of interest is recombinantly
expressed in an appropriate host cell. Thus, the VLPs
produced using the synthetic expression cassettes of the 
present invention are conveniently prepared using 
recombinant techniques. As discussed below, the Gag 
polypeptide encoding synthetic expression cassettes of 
the present invention can include other polypeptide 
coding sequences of interest (for example, Env, tat, rev, 
HIV protease, HIV polymerase, HCV core; see, Example 1). 
Expression of such synthetic expression cassettes yields 
VLPs comprising the product of the synthetic expression 
cassette, as well as, the polypeptide of interest. 

Once coding sequences for the desired particle- 
forming polypeptides have been isolated or synthesized, 
they can be cloned into any suitable vector or replicon 
for expression. Numerous cloning vectors are known to 
those of skill in the art, and the selection of an 
appropriate cloning vector is a matter of choice. See, 
generally, Ausubel et al, supra or Sambrook et al, supra. 
The vector is then used to transform an appropriate host 
cell. Suitable recombinant expression systems include,
but are not limited to, bacterial, mammalian,
baculovirus/insect, vaccinia, Semliki Forest virus (SFV),
Alphaviruses (such as Sindbis and Venezuelan Equine
Encephalitis (VEE)), yeast and Xenopus
expression systems, well known in the art. Particularly
preferred expression systems are mammalian cell lines,
vaccinia, Sindbis, insect and yeast systems.

For example, a number of mammalian cell lines are
known in the art and include immortalized cell lines
available from the American Type Culture Collection
(A.T.C.C.), such as, but not limited to, Chinese hamster
ovary (CHO) cells, 293 cells, HeLa cells, baby hamster
kidney (BHK) cells, mouse myeloma (SB20), monkey kidney
cells (COS), as well as others. Similarly, bacterial
hosts such as E. coli, Bacillus subtilis, and
Streptococcus spp., will find use with the present
expression constructs. Yeast hosts useful in the present
invention include, inter alia, Saccharomyces cerevisiae,
Candida albicans, Candida maltosa, Hansenula polymorpha,
Kluyveromyces fragilis, Kluyveromyces lactis, Pichia
guilliermondii, Pichia pastoris, Schizosaccharomyces
pombe and Yarrowia lipolytica. Insect cells for use with
baculovirus expression vectors include, inter alia, Aedes
aegypti, Autographa californica, Bombyx mori, Drosophila
melanogaster, Spodoptera frugiperda, and Trichoplusia ni.
See, e.g., Summers and Smith, Texas Agricultural
Experiment Station Bulletin No. 1555 (1987). Fungal
hosts include, for example, Aspergillus.

Viral vectors can be used for the production of 
particles in eucaryotic cells, such as those derived from 
the pox family of viruses, including vaccinia virus and 
avian poxvirus. Additionally, a vaccinia based
infection/transfection system, as described in Tomei et
al., J. Virol. (1993) 67:4017-4026 and Selby et al., J.
Gen. Virol. (1993) 74:1103-1113, will also find use with
the present invention. In this system, cells are first
infected in vitro with a vaccinia virus recombinant that
encodes the bacteriophage T7 RNA polymerase. This
polymerase displays exquisite specificity in that it only
transcribes templates bearing T7 promoters. Following
infection, cells are transfected with the DNA of
interest, driven by a T7 promoter. The polymerase
expressed in the cytoplasm from the vaccinia virus
recombinant transcribes the transfected DNA into RNA
which is then translated into protein by the host
translational machinery. Alternately, T7 can be added as
a purified protein or enzyme as in the "Progenitor"
system (Studier and Moffatt, J. Mol. Biol. (1986)
189:113-130). The method provides for high level,
transient, cytoplasmic production of large quantities of
RNA and its translation product(s).

Depending on the expression system and host
selected, the VLPs are produced by growing host cells
transformed by an expression vector under conditions
whereby the particle-forming polypeptide is expressed and
VLPs can be formed. The selection of the appropriate
growth conditions is within the skill of the art. If the
VLPs are formed intracellularly, the cells are then
disrupted, using chemical, physical or mechanical means,
which lyse the cells yet keep the VLPs substantially
intact. Such methods are known to those of skill in the
art and are described in, e.g., Protein Purification 
Applications: A Practical Approach, (E.L.V. Harris and 
S. Angal, Eds., 1990). 

The particles are then isolated (or substantially
purified) using methods that preserve the integrity
thereof, such as by density gradient centrifugation,
e.g., sucrose gradients, PEG-precipitation, pelleting,
and the like (see, e.g., Kirnbauer et al. J. Virol.
(1993) 67:6929-6936), as well as standard purification
techniques including, e.g., ion exchange and gel
filtration chromatography.

VLPs produced by cells containing the synthetic
expression cassettes of the present invention can be used
to elicit an immune response when administered to a
subject. One advantage of the present invention is that
VLPs can be produced by mammalian cells carrying the
synthetic expression cassettes at levels previously not
possible. As discussed above, the VLPs can comprise a
variety of antigens in addition to the Gag polypeptides
(e.g., Env, tat, Gag-protease, Gag-polymerase,
Gag-HCV-core). Purified VLPs, produced using the synthetic
expression cassettes of the present invention, can be
administered to a vertebrate subject, usually in the form
of vaccine compositions. Combination vaccines may also
be used, where such vaccines contain, for example, other
subunit proteins derived from HIV or other organisms
(e.g., env) or gene delivery vaccines encoding such
antigens. Administration can take place using the VLPs
formulated alone or formulated with other antigens.
Further, the VLPs can be administered prior to,
concurrent with, or subsequent to, delivery of the
synthetic expression cassettes for DNA immunization (see
below) and/or delivery of other vaccines. Also, the site
of VLP administration may be the same as or different
from that of other vaccine compositions that are being
administered. Gene delivery can be accomplished by a
number of methods including, but not limited to,
immunization with DNA, alphavirus vectors, pox virus
vectors, and vaccinia virus vectors.

VLP immune-stimulating (or vaccine) compositions can 
include various excipients, adjuvants, carriers, 
auxiliary substances, modulating agents, and the like. 
The immune stimulating compositions will include an
amount of the VLP/antigen sufficient to mount an
immunological response. An appropriate effective amount
can be determined by one of skill in the art. Such an
amount will fall in a relatively broad range that can be
determined through routine trials and will generally be
an amount on the order of about 0.1 µg to about 1000 µg,
more preferably about 1 µg to about 300 µg, of
VLP/antigen.

A carrier is optionally present which is a molecule
that does not itself induce the production of antibodies
harmful to the individual receiving the composition.
Suitable carriers are typically large, slowly metabolized
macromolecules such as proteins, polysaccharides,
polylactic acids, polyglycolic acids, polymeric amino
acids, amino acid copolymers, lipid aggregates (such as
oil droplets or liposomes), and inactive virus
particles. Examples of particulate carriers include
those derived from polymethyl methacrylate polymers, as
well as microparticles derived from poly(lactides) and
poly(lactide-co-glycolides), known as PLG. See, e.g.,
Jeffery et al., Pharm. Res. (1993) 10:362-368; McGee JP,
et al., J Microencapsul. 14(2):197-210, 1997; O'Hagan DT,
et al., Vaccine 11(2):149-54, 1993. Such carriers are
well known to those of ordinary skill in the art.
Additionally, these carriers may function as
immunostimulating agents ("adjuvants"). Furthermore, the
antigen may be conjugated to a bacterial toxoid, such as
toxoid from diphtheria, tetanus, cholera, etc., as well
as toxins derived from E. coli.

Such adjuvants include, but are not limited to: (1)
aluminum salts (alum), such as aluminum hydroxide,
aluminum phosphate, aluminum sulfate, etc.; (2)
oil-in-water emulsion formulations (with or without other
specific immunostimulating agents such as muramyl
peptides (see below) or bacterial cell wall components),
such as for example (a) MF59 (International Publication
No. WO 90/14837), containing 5% Squalene, 0.5% Tween 80,
and 0.5% Span 85 (optionally containing various amounts
of MTP-PE (see below), although not required) formulated
into submicron particles using a microfluidizer such as
Model 110Y microfluidizer (Microfluidics, Newton, MA),
(b) SAF, containing 10% Squalane, 0.4% Tween 80, 5%
pluronic-blocked polymer L121, and thr-MDP (see below)
either microfluidized into a submicron emulsion or
vortexed to generate a larger particle size emulsion, and
(c) Ribi™ adjuvant system (RAS), (Ribi Immunochem,
Hamilton, MT) containing 2% Squalene, 0.2% Tween 80, and
one or more bacterial cell wall components from the group
consisting of monophosphorylipid A (MPL), trehalose
dimycolate (TDM), and cell wall skeleton (CWS),
preferably MPL + CWS (Detox™); (3) saponin adjuvants,
such as Stimulon™ (Cambridge Bioscience, Worcester, MA)
may be used or particles generated therefrom such as
ISCOMs (immunostimulating complexes); (4) Complete
Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant
(IFA); (5) cytokines, such as interleukins (IL-1, IL-2,
etc.), macrophage colony stimulating factor (M-CSF),
tumor necrosis factor (TNF), beta chemokines (MIP-1-alpha,
MIP-1-beta, RANTES, etc.); (6) detoxified mutants of a
bacterial ADP-ribosylating toxin such as a cholera toxin
(CT), a pertussis toxin (PT), or an E. coli heat-labile
toxin (LT), particularly LT-K63 (where lysine is
substituted for the wild-type amino acid at position 63),
LT-R72 (where arginine is substituted for the wild-type
amino acid at position 72), CT-S109 (where serine is
substituted for the wild-type amino acid at position
109), and PT-K9/G129 (where lysine is substituted for the
wild-type amino acid at position 9 and glycine
substituted at position 129) (see, e.g., International
Publication Nos. WO 93/13202 and WO 92/19265); and (7)
other substances that act as immunostimulating agents to
enhance the effectiveness of the composition.

Muramyl peptides include, but are not limited to,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(MTP-PE), etc.

Dosage treatment with the VLP composition may be a
single dose schedule or a multiple dose schedule. A
multiple dose schedule is one in which a primary course
of vaccination may be with 1-10 separate doses, followed
by other doses given at subsequent time intervals, chosen 
to maintain and/or reinforce the immune response, for 
example at 1-4 months for a second dose, and if needed, a 
subsequent dose(s) after several months. The dosage 
regimen will also, at least in part, be determined by the 
potency of the modality, the vaccine delivery employed, 
the need of the subject and be dependent on the judgment 
of the practitioner. 

If prevention of disease is desired (e.g., reduction 
of symptoms, recurrences or of disease progression), the 
antigen carrying VLPs are generally administered prior to 
primary infection with the pathogen of interest. If 
treatment is desired, e.g., the reduction of symptoms or 
recurrences, the VLP compositions are generally 
administered subsequent to primary infection. 

2.2.2 Using the Synthetic Expression Cassettes of the Present Invention to Create Packaging Cell Lines

A number of viral based systems have been developed
for use as gene transfer vectors for mammalian host
cells. For example, retroviruses (in particular,
lentiviral vectors) provide a convenient platform for
gene delivery systems. A coding sequence of interest
(for example, a sequence useful for gene therapy
applications) can be inserted into a gene delivery vector
and packaged in retroviral particles using techniques
known in the art. Recombinant virus can then be isolated
and delivered to cells of the subject either in vivo or
ex vivo. A number of retroviral systems have been
described, including, for example, the following: U.S.
Patent No. 5,219,740; Miller et al. (1989) BioTechniques
7:980; Miller, A.D. (1990) Human Gene Therapy 1:5; Scarpa
et al. (1991) Virology 180:849; Burns et al. (1993) Proc.
Natl. Acad. Sci. USA 90:8033; Boris-Lawrie et al. (1993)
Cur. Opin. Genet. Develop. 3:102; GB 2200651; EP 0415731;
EP 0345242; WO 89/02468; WO 89/05349; WO 89/09271; WO
90/02806; WO 90/07936; WO 94/03622; WO 93/25698; WO
93/25234; WO 93/11230; WO 93/10218; WO 91/02805; U.S.
5,219,740; U.S. 4,405,712; U.S. 4,861,719; U.S. 4,980,289
and U.S. 4,777,127; U.S. Serial No. 07/800,921; Vile
(1993) Cancer Res 53:3860-3864; Vile (1993) Cancer Res
53:962-967; Ram (1993) Cancer Res 53:83-88; Takamiya
(1992) J Neurosci Res 33:493-503; Baba (1993) J Neurosurg
79:729-735; Mann (1983) Cell 33:153; Cone (1984) Proc Natl
Acad Sci USA 81:6349; and Miller (1990) Human Gene
Therapy 1.

Sequences useful for gene therapy applications
include, but are not limited to, the following. Factor
VIII cDNA, including derivatives and deletions thereof
(International Publication Nos. WO 96/21035, WO 97/03193,
WO 97/03194, WO 97/03195, and WO 97/03191). Factor IX
cDNA (Kurachi et al. (1982) Proc. Natl. Acad. Sci. USA
79:6461-6464). Factor V cDNA can be obtained from pMT2-V
(Jenny (1987) Proc. Natl. Acad. Sci. USA 84:4846,
A.T.C.C. Deposit No. 40515). A full-length factor V
cDNA, or a B domain deletion or B domain substitution
thereof, can be used. B domain deletions of factor V
include those reported by Marquette (1995) Blood 86:3026
and Kane (1990) Biochemistry 29:6762. Antithrombin III
cDNA (Prochownik (1983) J. Biol. Chem. 258:8389, A.T.C.C.
Deposit No. 57224/57225). Protein C encoding cDNA
(Foster (1984) Proc. Natl. Acad. Sci. USA 81:4766;
Beckmann (1985) Nucleic Acids Res. 13:5233). Prothrombin
cDNA can be obtained by restriction enzyme digestion of a
published vector (Degen (1983) Biochemistry 22:2087).
The endothelial cell surface protein, thrombomodulin, is
a necessary cofactor for the normal activation of protein
C by thrombin. A soluble recombinant form has been
described (Parkinson (1990) J. Biol. Chem. 265:12602;
Jackman (1987) Proc. Natl. Acad. Sci. USA 84:6425; Shirai
(1988) J. Biochem. 103:281; Wen (1987) Biochemistry
26:4350; Suzuki (1987) EMBO J. 6:1891, A.T.C.C. Deposit
No. 61348, 61349).

Many genetic diseases caused by inheritance of
defective genes result in the failure to produce normal
gene products, for example, thalassemia, phenylketonuria,
Lesch-Nyhan syndrome, severe combined immunodeficiency
(SCID), hemophilia A and B, cystic fibrosis, Duchenne's
Muscular Dystrophy, inherited emphysema and familial
hypercholesterolemia (Mulligan et al. (1993) Science
260:926; Anderson et al. (1992) Science 256:808; Friedman
et al. (1989) Science 244:1275). Although genetic
diseases may result in the absence of a gene product,
endocrine disorders, such as diabetes and
hypopituitarism, are caused by the inability of the gene
to produce adequate levels of the appropriate hormone,
insulin and human growth hormone, respectively.

In one aspect, gene therapy employing the constructs
and methods of the present invention involves the
introduction of normal recombinant genes into T cells so
that new or missing proteins are produced by the T cells
after introduction or reintroduction thereof into a
patient. A number of genetic diseases have been selected
for treatment with gene therapy, including adenosine
deaminase deficiency, cystic fibrosis, α1-antitrypsin
deficiency, Gaucher's syndrome, as well as non-genetic
diseases.

In particular, Gaucher's syndrome is a genetic
disorder characterized by a deficiency of the enzyme
glucocerebrosidase. This enzyme deficiency leads to the
accumulation of glucocerebroside in the lysosomes of all
cells in the body. (For a review see Science 256:794
(1992) and Scriver et al., The Metabolic Basis of
Inherited Disease, 6th ed., vol. 2, page 1677.) Thus,
gene transfer vectors that express glucocerebrosidase can
be constructed for use in the treatment of this disorder.
Likewise, gene transfer vectors encoding lactase can be
used in the treatment of hereditary lactose intolerance,
those expressing ADA can be used for treatment of ADA
deficiency, and gene transfer vectors encoding
α1-antitrypsin can be used to treat α1-antitrypsin
deficiency. See Ledley, F.D. (1987) J. Pediatrics
110:157-174, Verma, I. (Nov. 1987) Scientific American
pp. 68-84, and International Publication No. WO 95/27512
entitled "Gene Therapy Treatment for a Variety of
Diseases and Disorders," for a description of gene
therapy treatment of genetic diseases.

In still further embodiments of the invention,
nucleotide sequences which can be incorporated into a
gene transfer vector include, but are not limited to,
proteins associated with enzyme-deficiency disorders,
such as the cystic fibrosis transmembrane regulator (see,
for example, U.S. Patent No. 5,240,846 and Larrick et al.
(1991) Gene Therapy Applications of Molecular Biology,
Elsevier, New York) and adenosine deaminase (ADA) (see
U.S. Patent No. 5,399,346); growth factors, or an agonist
or antagonist of a growth factor (Bandara et al. (1992)
DNA and Cell Biology, 11:227); one or more tumor
suppressor genes such as p53, Rb, or C-CAM1 (Kleinerman
et al. (1995) Cancer Research 55:2831); a molecule that
modulates the immune system of an organism, such as a HLA
molecule (Nabel et al. (1993) Proc. Natl. Acad. Sci. USA
90:11307); a ribozyme (Larsson et al. (1996) Virology
219:161); a peptide nucleic acid (Hirshman et al. (1996)
J. Invest. Med. 44:347); an antisense molecule (Bordier
et al. (1995) Proc. Natl. Acad. Sci. USA 92:9383) which
can be used to down-regulate the expression or synthesis
of aberrant or foreign proteins, such as HIV proteins or
a wide variety of oncogenes such as p53 (Hesketh, The
Oncogene Facts Book, Academic Press, New York, (1995)); a
biopharmaceutical agent or antisense molecule used to
treat HIV-infection, such as an inhibitor of p24
(Nakashima et al. (1994) Nucleic Acids Res. 22:5004); or
reverse-transcriptase (see, Bordier, supra).

Other proteins of therapeutic interest can be 
expressed in vivo by gene transfer vectors using the 
methods of the invention. For instance, sustained in vivo
expression of tissue factor inhibitory protein (TFPI) is
useful for treatment of conditions including sepsis and
DIC and in preventing reperfusion injury. (See
International Publication Nos. WO 93/24143, WO 93/25230
and WO 96/06637.) Nucleic acid sequences encoding
various forms of TFPI can be obtained, for example, as 
described in US Patent Nos. 4,966,852; 5,106,833; and 
5,466,783, and incorporated into the gene transfer 
vectors described herein.

Erythropoietin (EPO) and leptin can also be
expressed in vivo from genetically modified T cells
according to the methods of the invention. For instance,
EPO is useful in gene therapy treatment of a variety of
disorders including anemia (see International Publication
No. WO 95/13376 entitled "Gene Therapy for Treatment of
Anemia"). Sustained delivery of leptin by the methods of
the invention is useful in treatment of obesity. See 
International Publication No. WO 96/05309 for a 
description of the leptin gene and the use thereof in the 
treatment of obesity. 

A variety of other disorders can also be treated by
the methods of the invention. For example, sustained in
vivo systemic production of apolipoprotein E or
apolipoprotein A from genetically modified T cells can be
used for treatment of hyperlipidemia (see Breslow et al.
(1994) Biotechnology 12:365). Sustained production of
angiotensin receptor inhibitor (Goodfriend et al. (1996)
N. Engl. J. Med. 331:1469) can be provided by the methods
described herein. As yet an additional example, the long
term in vivo systemic production of angiostatin is useful
in the treatment of a variety of tumors. (See O'Reilly et
al. (1996) Nature Med. 2:689.)

In other embodiments, gene transfer vectors can be 
constructed to encode a cytokine or other 
immunomodulatory molecule. For example, nucleic acid 
sequences encoding native IL-2 and gamma-interferon can
be obtained as described in US Patent Nos. 4,738,927 and
5,326,859, respectively, while useful muteins of these
proteins can be obtained as described in U.S. Patent No.
4,853,332. Nucleic acid sequences encoding the short and
long forms of mCSF can be obtained as described in US
Patent Nos. 4,847,201 and 4,879,227, respectively. In
particular aspects of the invention, retroviral vectors
expressing cytokine or immunomodulatory genes can be
produced as described herein (for example, employing the
packaging cell lines of the present invention) and in
International Application No. PCT US 94/02951, entitled
"Compositions and Methods for Cancer Immunotherapy."

Examples of suitable immunomodulatory molecules for 
use herein include the following: IL-1 and IL-2 (Karupiah
et al. (1990) J. Immunology 144:290-298, Weber et al.
(1987) J. Exp. Med. 166:1716-1733, Gansbacher et al.
(1990) J. Exp. Med. 172:1217-1224, and U.S. Patent No.
4,738,927); IL-3 and IL-4 (Tepper et al. (1989) Cell
57:503-512, Golumbek et al. (1991) Science 254:713-716,
and U.S. Patent No. 5,017,691); IL-5 and IL-6 (Brakenhof
et al. (1987) J. Immunol. 139:4116-4121, and
International Publication No. WO 90/06370); IL-7 (U.S.
Patent No. 4,965,195); IL-8, IL-9, IL-10, IL-11, IL-12,
and IL-13 (Cytokine Bulletin, Summer 1994); IL-14 and
IL-15; alpha interferon (Finter et al. (1991) Drugs
42:749-765, U.S. Patent Nos. 4,892,743 and 4,966,843,
International Publication No. WO 85/02862, Nagata et al.
(1980) Nature 284:316-320, Familletti et al. (1981)
Methods in Enz. 78:387-394, Twu et al. (1989) Proc. Natl.
Acad. Sci. USA 86:2046-2050, and Faktor et al. (1990)
Oncogene 5:867-872); beta-interferon (Seif et al. (1991)
J. Virol. 65:664-671); gamma-interferons (Radford et al.
(1991) The American Society of Hepatology 2008-2015,
Watanabe et al. (1989) Proc. Natl. Acad. Sci. USA
86:9456-9460, Gansbacher et al. (1990) Cancer Research
50:7820-7825, Maio et al. (1989) Can. Immunol.
Immunother. 30:34-42, and U.S. Patent Nos. 4,762,791 and
4,727,138); G-CSF (U.S. Patent Nos. 4,999,291 and
4,810,643); GM-CSF (International Publication No. WO
85/04188); tumor necrosis factors (TNFs) (Jayaraman et
al. (1990) J. Immunology 144:942-951); CD3 (Krissanen et
al. (1987) Immunogenetics 26:258-266); ICAM-1 (Altman et
al. (1989) Nature 338:512-514, Simmons et al. (1988)
Nature 331:624-627); ICAM-2, LFA-1, LFA-3 (Wallner et al.
(1987) J. Exp. Med. 166:923-932); MHC class I molecules,
MHC class II molecules, B7.1-.3, β2-microglobulin (Parnes
et al. (1981) Proc. Natl. Acad. Sci. USA 78:2253-2257);
chaperones such as calnexin; and MHC-linked transporter
proteins or analogs thereof (Powis et al. (1991) Nature
354:528-531). Immunomodulatory factors may also be
agonists, antagonists, or ligands for these molecules.
For example, soluble forms of receptors can often behave
as antagonists for these types of factors, as can mutated
forms of the factors themselves.

Nucleic acid molecules that encode the above-described
substances, as well as other nucleic acid
molecules that are advantageous for use within the
present invention, may be readily obtained from a variety
of sources, including, for example, depositories such as
the American Type Culture Collection, or from commercial
sources such as British Bio-Technology Limited (Cowley,
Oxford, England). Representative examples include BBG 12
(containing the GM-CSF gene coding for the mature protein
of 127 amino acids), BBG 6 (which contains sequences
encoding gamma interferon), A.T.C.C. Deposit No. 39656
(which contains sequences encoding TNF), A.T.C.C. Deposit
No. 20663 (which contains sequences encoding
alpha-interferon), A.T.C.C. Deposit Nos. 31902, 31902 and
39517 (which contain sequences encoding beta-interferon),
A.T.C.C. Deposit No. 67024 (which contains a sequence
which encodes Interleukin-1b), A.T.C.C. Deposit Nos.
39405, 39452, 39516, 39626 and 39673 (which contain
sequences encoding Interleukin-2), A.T.C.C. Deposit Nos.
59399, 59398, and 67326 (which contain sequences encoding
Interleukin-3), A.T.C.C. Deposit No. 57592 (which
contains sequences encoding Interleukin-4), A.T.C.C.
Deposit Nos. 59394 and 59395 (which contain sequences
encoding Interleukin-5), and A.T.C.C. Deposit No. 67153
(which contains sequences encoding Interleukin-6).

Plasmids containing cytokine genes or 
immunomodulatory genes (International Publication Nos. WO
94/02951 and WO 96/21015) can be digested with appropriate
restriction enzymes, and DNA fragments containing the
particular gene of interest can be inserted into a gene
transfer vector using standard molecular biology
techniques. (See, e.g., Sambrook et al., supra, or
Ausubel et al. (eds.) Current Protocols in Molecular
Biology, Greene Publishing and Wiley-Interscience.)

Exemplary hormones, growth factors and other 
proteins which are useful for long term expression are 
described, for example, in European Publication No.
0437478B1, entitled "Cyclodextrin-Peptide Complexes."
Nucleic acid sequences encoding a variety of hormones can
be used, including those encoding human growth hormone,
insulin, calcitonin, prolactin, follicle stimulating
hormone (FSH), luteinizing hormone (LH), human chorionic
gonadotropin (HCG), and thyroid stimulating hormone
(TSH). A variety of different forms of IGF-1 and IGF-2
growth factor polypeptides are also well known in the art
and can be incorporated into gene transfer vectors for
long term expression in vivo. See, e.g., European Patent
No. 0123228B1, published for grant September 19, 1993,
entitled "Hybrid DNA Synthesis of Mature Insulin-like
Growth Factors." As an additional example, the long term
in vivo expression of different forms of fibroblast
growth factor can also be effected employing the
compositions and methods of the invention. See, e.g., U.S.
Patent Nos. 5,464,774, 5,155,214, and 4,994,559 for a
description of different fibroblast growth factors.

Polynucleotide sequences coding for the above-described
molecules can be obtained using recombinant
methods, such as by screening cDNA and genomic libraries
from cells expressing the gene, or by deriving the gene
from a vector known to include the same. For example,
plasmids which contain sequences that encode altered
cellular products may be obtained from a depository such
as the A.T.C.C., or from commercial sources. Plasmids
containing the nucleotide sequences of interest can be
digested with appropriate restriction enzymes, and DNA
fragments containing the nucleotide sequences can be
inserted into a gene transfer vector using standard
molecular biology techniques.

Alternatively, cDNA sequences for use with the
present invention may be obtained from cells which
express or contain the sequences, using standard
techniques, such as phenol extraction and PCR of cDNA or
genomic DNA. See, e.g., Sambrook et al., supra, for a
description of techniques used to obtain and isolate DNA.
Briefly, mRNA from a cell which expresses the gene of
interest can be reverse transcribed with reverse
transcriptase using oligo-dT or random primers. The
single stranded cDNA may then be amplified by PCR (see
U.S. Patent Nos. 4,683,202, 4,683,195 and 4,800,159; see
also PCR Technology: Principles and Applications for DNA
Amplification, Erlich (ed.), Stockton Press, 1989) using
oligonucleotide primers complementary to sequences on
either side of the desired sequences.
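
As a rough illustration of the primer step described above, the following Python sketch derives a forward and a reverse primer from the two ends of a desired sequence. The function names, the primer length and the example sequence are hypothetical and are not taken from this disclosure. Practical primer design would also weigh melting temperature, specificity and secondary structure, none of which is modeled here.

```python
# Minimal sketch: primers flanking a desired sequence (illustrative only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def flanking_primers(target: str, length: int = 18) -> tuple[str, str]:
    """Return (forward, reverse) primers for the two ends of `target`.

    The forward primer matches the first `length` bases of the sense
    strand. The reverse primer is the reverse complement of the last
    `length` bases, so that it anneals to the sense strand.
    """
    if len(target) < 2 * length:
        raise ValueError("target is too short for the requested primer length")
    target = target.upper()
    return target[:length], reverse_complement(target[-length:])


if __name__ == "__main__":
    # Hypothetical sequence, used only to exercise the functions.
    forward, reverse = flanking_primers("ATGGCCAAGTTTGACC", length=4)
    print(forward, reverse)
```

Each primer is written 5' to 3', which is the conventional orientation for ordering oligonucleotides.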

The nucleotide sequence of interest can also be
produced synthetically, rather than cloned, using a DNA
synthesizer (e.g., an Applied Biosystems Model 392 DNA
Synthesizer, available from ABI, Foster City,
California). The nucleotide sequence can be designed
with the appropriate codons for the expression product
desired. The complete coding sequence is assembled from
overlapping oligonucleotides prepared by standard
methods. See, e.g., Edge (1981) Nature 292:756; Nambiar
et al. (1984) Science 223:1299; Jay et al. (1984) J.
Biol. Chem. 259:6311.
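
The design-and-assembly idea above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure of the cited references: the one-codon-per-residue table `PREFERRED`, the oligonucleotide length and overlap, and the example peptide are all hypothetical choices made for this sketch. Real gene synthesis typically alternates strands and checks each overlap for uniqueness, which is omitted here.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical table: one chosen codon per amino acid (an assumption of this
# sketch; a real design would use a usage table for the intended host).
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def back_translate(protein: str) -> str:
    """Encode a protein sequence using the chosen codon for each residue."""
    return "".join(PREFERRED[aa] for aa in protein.upper())


def translate(dna: str) -> str:
    """Translate complete codons of `dna` with the standard genetic code."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, usable, 3))


def tile(seq: str, length: int = 40, overlap: int = 20) -> list[str]:
    """Split `seq` into same-strand fragments that overlap their neighbors."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap
    return [seq[i:i + length] for i in range(0, max(len(seq) - overlap, 1), step)]


def reassemble(oligos: list[str], overlap: int) -> str:
    """Rejoin fragments produced by `tile` by dropping each shared overlap."""
    return oligos[0] + "".join(oligo[overlap:] for oligo in oligos[1:])


if __name__ == "__main__":
    peptide = "MKTAYIAKQR"  # arbitrary example peptide
    coding = back_translate(peptide)
    fragments = tile(coding, length=12, overlap=6)
    print(coding, fragments)
```

Translating the designed sequence back to protein is a cheap sanity check that the codon substitutions left the encoded polypeptide unchanged.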

The synthetic expression cassettes of the present 
invention can be employed in the construction of 
packaging cell lines for use with retroviral vectors. 

One type of retrovirus, the murine leukemia virus,
or "MLV", has been widely utilized for gene therapy
applications (see generally Mann et al. (Cell 33:153,
1983), Cone and Mulligan (Proc. Nat'l. Acad. Sci. USA
81:6349, 1984), and Miller et al., Human Gene Therapy
1:5-14, 1990).

Lentiviral vectors typically comprise a 5'
lentiviral LTR, a tRNA binding site, a packaging signal,
a promoter operably linked to one or more genes of
interest, an origin of second strand DNA synthesis and a
3' lentiviral LTR, wherein the lentiviral vector contains
a nuclear transport element. The nuclear transport
element may be located either upstream (5') or downstream
(3') of a coding sequence of interest. Within certain
embodiments, the nuclear transport element is not RRE.
Within one embodiment the packaging signal is an extended
packaging signal. Within other embodiments the promoter
is a tissue specific promoter, or, alternatively, a
promoter such as CMV. Within other embodiments, the
lentiviral vector further comprises an internal ribosome
entry site.
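
The element list above can be captured as a simple layout check. The Python sketch below is illustrative only: the element labels, the function name and the assumption that the elements appear 5' to 3' in the order in which they are listed above are choices made for this sketch rather than requirements stated in the disclosure.

```python
# Minimal sketch: check a lentiviral vector layout against the listed elements.
# The ordering below simply follows the order of the list in the text (assumed).
ELEMENT_ORDER = (
    "5' LTR",
    "tRNA binding site",
    "packaging signal",
    "promoter",
    "gene of interest",
    "origin of second strand DNA synthesis",
    "3' LTR",
)
NUCLEAR_TRANSPORT = "nuclear transport element"


def check_layout(elements: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means the layout passes."""
    problems = []
    positions = []
    for name in ELEMENT_ORDER:
        if name in elements:
            positions.append(elements.index(name))
        else:
            problems.append(f"missing: {name}")
    if positions != sorted(positions):
        problems.append("elements are not in the expected 5'-to-3' order")
    # The nuclear transport element may sit upstream or downstream of the
    # coding sequence, so only its presence is checked.
    if NUCLEAR_TRANSPORT not in elements:
        problems.append(f"missing: {NUCLEAR_TRANSPORT}")
    return problems


if __name__ == "__main__":
    layout = list(ELEMENT_ORDER)
    layout.insert(5, NUCLEAR_TRANSPORT)  # downstream of the gene of interest
    print(check_layout(layout))
```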

A wide variety of lentiviruses may be utilized 
within the context of the present invention, including 
for example, lentiviruses selected from the group 
consisting of HIV, HIV-1, HIV-2, FIV and SIV.

In one embodiment of the present invention synthetic
Env and/or Gag-polymerase expression cassettes are
provided comprising a promoter and a sequence encoding
synthetic Gag-polymerase (SEQ ID NO: 6) and at least one
of vpr, vpu, nef or vif, wherein the promoter is operably
linked to Gag-polymerase and vpr, vpu, nef or vif.

Within yet another aspect of the invention, host 
cells (e.g., packaging cell lines) are provided which 
contain any of the expression cassettes described herein. 
For example, within one aspect packaging cell lines are
provided comprising an expression cassette that comprises
a sequence encoding synthetic Env and/or Gag-polymerase,
and a nuclear transport element, wherein the promoter is
operably linked to the sequence encoding Env and/or
Gag-polymerase. Packaging cell lines may further comprise a
promoter and a sequence encoding tat, rev, or an
envelope, wherein the promoter is operably linked to the
sequence encoding tat, rev, or the envelope. The
packaging cell line may further comprise a sequence 
encoding any one or more of nef, vif, vpu or vpr. 

In one embodiment, the expression cassette 
(carrying, for example, the synthetic Env, synthetic tat 
and/or synthetic Gag-polymerase) is stably integrated. 
The packaging cell line, upon introduction of a 
lentiviral vector, typically produces viral particles. 
The promoter regulating expression of the synthetic 
expression cassette may be inducible. Typically, the 
packaging cell line, upon introduction of a lentiviral 
vector, produces viral particles that are essentially 
free of replication competent virus. 

Packaging cell lines are provided comprising an 
expression cassette which directs the expression of a 
synthetic Env (or Gag-polymerase) gene, and an expression
cassette which directs the expression of a Gag (or Env)
gene optimized for expression (e.g., Andre, S., et al.,
Journal of Virology 72(2):1497-1503, 1998; Haas, J., et
al., Current Biology 6(3):315-324, 1996). A lentiviral
vector is introduced into the packaging cell line to
produce a vector particle producing cell line.

As noted above, lentiviral vectors can be designed 
to carry or express a selected gene(s) or sequences of 
interest. Lentiviral vectors may be readily constructed 
from a wide variety of lentiviruses (see RNA Tumor 
Viruses, Second Edition, Cold Spring Harbor Laboratory, 
1985). Representative examples of lentiviruses include
HIV, HIV-1, HIV-2, FIV and SIV. Such lentiviruses may
either be obtained from patient isolates, or, more 
preferably, from depositories or collections such as the 
American Type Culture Collection, or isolated from known 
sources using available techniques. 

Portions of the lentiviral gene delivery vectors (or 
vehicles) may be derived from different viruses. For 
example, in a given recombinant lentiviral vector, LTRs 
may be derived from an HIV, a packaging signal from SIV, 
and an origin of second strand synthesis from HIV-2.
Lentiviral vector constructs may comprise a 5' lentiviral
LTR, a tRNA binding site, a packaging signal, one or more
heterologous sequences, an origin of second strand DNA
synthesis and a 3' LTR, wherein said lentiviral vector
contains a nuclear transport element that is not RRE. 

Briefly, Long Terminal Repeats ("LTRs") are 
subdivided into three elements, designated U5, R and U3.
These elements contain a variety of signals which are
responsible for the biological activity of a retrovirus,
including for example, promoter and enhancer elements
which are located within U3. LTRs may be readily
identified in the provirus (integrated DNA form) due to
their precise duplication at either end of the genome.
As utilized herein, a 5' LTR should be understood to
include a 5' promoter element and sufficient LTR sequence
to allow reverse transcription and integration of the DNA
form of the vector. The 3' LTR should be understood to
include a polyadenylation signal, and sufficient LTR
sequence to allow reverse transcription and integration
of the DNA form of the vector. 

The tRNA binding site and origin of second strand 
DNA synthesis are also important for a retrovirus to be 
biologically active, and may be readily identified by one 
of skill in the art. For example, retroviral tRNA binds 
to a tRNA binding site by Watson-Crick base pairing, and 
is carried with the retrovirus genome into a viral 
particle. The tRNA is then utilized as a primer for DNA 
synthesis by reverse transcriptase. The tRNA binding 
site may be readily identified based upon its location 
just downstream from the 5' LTR. Similarly, the origin of
second strand DNA synthesis is, as its name implies,
important for the second strand DNA synthesis of a
retrovirus. This region, which is also referred to as
the poly-purine tract, is located just upstream of the
3' LTR.

In addition to a 5' and 3' LTR, tRNA binding site,
and origin of second strand DNA synthesis, recombinant
retroviral vector constructs may also comprise a
packaging signal, as well as one or more genes or coding
sequences of interest. In addition, the lentiviral
vectors have a nuclear transport element which, in
preferred embodiments, is not RRE. Representative
examples of suitable nuclear transport elements include
the element in Rous sarcoma virus (Ogert et al., J. Virol.
70, 3834-3843, 1996), the element in Rous sarcoma virus
(Liu & Mertz, Genes & Dev., 9, 1766-1789, 1995) and the
element in the genome of simian retrovirus type I
(Zolotukhin et al., J. Virol. 68, 7944-7952, 1994).
Other potential elements include the elements in the
histone gene (Kedes, Annu. Rev. Biochem. 48, 837-870,
1979), the α-interferon gene (Nagata et al., Nature 287,
401-408, 1980), the β-adrenergic receptor gene (Kobilka
et al., Nature 329, 75-79, 1987), and the c-Jun gene
(Hattori et al., Proc. Natl. Acad. Sci. USA 85,
9148-9152, 1988).

Recombinant lentiviral vector constructs typically
lack both Gag-polymerase and env coding sequences.
Recombinant lentiviral vectors typically contain less than
20, preferably 15, more preferably 10, and most
preferably 8 consecutive nucleotides found in
Gag-polymerase or env genes. One advantage of the present
invention is that the synthetic Gag-polymerase expression
cassettes, which can be used to construct packaging cell
lines for the recombinant retroviral vector constructs,
have little homology to wild-type Gag-polymerase
sequences and thus considerably reduce or eliminate the
possibility of homologous recombination between the
synthetic and wild-type sequences.
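
The "consecutive nucleotides" criterion above can be checked computationally. The Python sketch below finds the longest run of consecutive nucleotides shared by two sequences and compares it with a limit such as 20, 15, 10 or 8. The function names and example sequences are hypothetical, and only the given strands are compared. A fuller check would presumably also consider the reverse complement, which this sketch does not do.

```python
# Minimal sketch: longest run of consecutive nucleotides shared by two sequences.
def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest common substring of `a` and `b`.

    Uses the classic dynamic-programming recurrence, keeping only one
    previous row, so memory use is proportional to len(b).
    """
    a, b = a.upper(), b.upper()
    best = 0
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def shares_fewer_than(vector: str, reference: str, limit: int = 8) -> bool:
    """True if the longest shared run is shorter than `limit` nucleotides."""
    return longest_shared_run(vector, reference) < limit


if __name__ == "__main__":
    # Hypothetical toy sequences, far shorter than any real vector or gene.
    print(longest_shared_run("ACGTACGT", "TTACGTAA"))
```

The quadratic running time is acceptable for a sketch. For genome-length inputs a suffix-based or k-mer index approach would likely be preferable.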

Lentiviral vectors may also include tissue-specific
promoters to drive expression of one or more genes or 
sequences of interest. For example, lentiviral vector 
particles of the invention can contain a liver specific 
promoter to maximize the potential for liver specific 
expression of the exogenous DNA sequence contained in the 
vectors. Preferred liver specific promoters include the 
hepatitis B X-gene promoter and the hepatitis B core 
protein promoter. These liver specific promoters are 
preferably employed with their respective enhancers. The 
enhancer element can be linked at either the 5' or the 3' 
end of the nucleic acid encoding the sequences of 
interest. The hepatitis B X gene promoter and its
enhancer can be obtained from the viral genome as a 332
base pair EcoRV-NcoI DNA fragment employing the methods
described in Twu et al., J. Virol. 61:3448-3453, 1987.
The hepatitis B core protein promoter can be obtained
from the viral genome as a 584 base pair BamHI-BglII DNA
fragment employing the methods described in Gerlach et
al., Virol. 189:59-66, 1992. It may be necessary to
remove the negative regulatory sequence in the BamHI-BglII
fragment prior to inserting it. Other liver
specific promoters include the AFP (alpha fetal protein)
gene promoter and the albumin gene promoter, as disclosed
in EP Patent Publication 0 415 731, the α-1 antitrypsin
gene promoter, as disclosed in Rettenger et al., Proc.
Natl. Acad. Sci. 91:1460-1464, 1994, the fibrinogen
gene promoter, the APO-A1 (Apolipoprotein A1) gene
promoter, and the promoter genes for liver transferase
enzymes such as, for example, SGOT, SGPT and glutamyl
transferase. See also PCT Patent Publications WO
90/07936 and WO 91/02805 for a description of the use of
liver specific promoters in lentiviral vector particles.

Lentiviral vector constructs may be generated such 
that more than one gene of interest is expressed. This 
may be accomplished through the use of di- or oligo- 
cistronic cassettes (e.g., where the coding regions are 
separated by 80 nucleotides or less, see generally Levin
et al., Gene 108:167-174, 1991), or through the use of
Internal Ribosome Entry Sites ("IRES"). 

Packaging cell lines suitable for use with the above 
described recombinant retroviral vector constructs may be 
readily prepared given the disclosure provided herein.
Briefly, the parent cell line from which the packaging
cell line is derived can be selected from a variety of
mammalian cell lines, including for example, 293, RD,
COS-7, CHO, BHK, VERO, HT1080, and myeloma cells.

After selection of a suitable host cell for the 
generation of a packaging cell line, one or more
expression cassettes are introduced into the cell line in
order to complement or supply in trans components of the 
vector which have been deleted. 

Representative examples of suitable expression 
cassettes have been described herein and include 
synthetic Env, tat, Gag, synthetic Gag-protease, 
synthetic Gag-reverse transcriptase and synthetic Gag- 
polymerase expression cassettes, which comprise a 
promoter and a sequence encoding, e.g., Env, tat, or
Gag-polymerase and at least one of vpr, vpu, nef or vif,
wherein the promoter is operably linked to Env, tat or
Gag-polymerase and vpr, vpu, nef or vif. As described
above, optimized Env, Gag and/or tat coding sequences may
also be utilized in various combinations in the
generation of packaging cell lines.

Utilizing the above-described expression cassettes, 
a wide variety of packaging cell lines can be generated.
For example, within one aspect packaging cell lines are
provided comprising an expression cassette that comprises
a sequence encoding a synthetic HIV (e.g., Gag, Env, tat,
Gag-polymerase, Gag-reverse transcriptase or
Gag-protease) polypeptide, and a nuclear transport element,
wherein the promoter is operably linked to the sequence
encoding the HIV polypeptide. Within other aspects,
packaging cell lines are provided comprising a promoter
and a sequence encoding Gag, tat, rev, or an envelope
(e.g., HIV env), wherein the promoter is operably linked
to the sequence encoding Gag, tat, rev, or the envelope.
Within further embodiments, the packaging cell line may
comprise a sequence encoding any one or more of nef, vif,
vpu or vpr. For example, the packaging cell line may
contain only nef, vif, vpu, or vpr alone, nef and vif,
nef and vpu, nef and vpr, vif and vpu, vif and vpr, vpu
and vpr, nef vif and vpu, nef vif and vpr, nef vpu and
vpr, vif vpu and vpr, or all four of nef vif vpu and
vpr.
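
The combinations enumerated above are the non-empty subsets of the four accessory genes. As a small illustration, the following Python sketch generates them programmatically. The function name is hypothetical and the code adds nothing beyond the listing in the text.

```python
from itertools import combinations

ACCESSORY_GENES = ("nef", "vif", "vpu", "vpr")


def accessory_gene_sets(genes=ACCESSORY_GENES):
    """Return every non-empty combination of the given genes, smallest first."""
    return [
        combo
        for size in range(1, len(genes) + 1)
        for combo in combinations(genes, size)
    ]


if __name__ == "__main__":
    for combo in accessory_gene_sets():
        print(" and ".join(combo))
```

For four genes this gives 4 single genes, 6 pairs, 4 triples and 1 set of all four, or 15 combinations in total.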

In one embodiment, the expression cassette is stably
integrated. Within another embodiment, the packaging
cell line, upon introduction of a lentiviral vector,
produces particles. Within further embodiments the
promoter is inducible. Within certain preferred
embodiments of the invention, the packaging cell line,
upon introduction of a lentiviral vector, produces
particles that are free of replication competent virus.

The synthetic cassettes containing optimized coding
sequences are transfected into a selected cell line.
Transfected cells are selected that (i) carry, typically,
integrated, stable copies of the Gag, Pol, and Env coding
sequences, and (ii) are expressing acceptable levels of
these polypeptides (expression can be evaluated by
methods known in the prior art, e.g., see Examples 1-4).
The ability of the cell line to produce VLPs may also be
verified (Examples 6, 7 and 15).

A sequence of interest is constructed into a
suitable viral vector as discussed above. This defective
virus is then transfected into the packaging cell line.
The packaging cell line provides the viral functions
necessary for producing virus-like particles into which
the defective viral genome, containing the sequence of
interest, is packaged. These VLPs are then isolated and
can be used, for example, in gene delivery or gene
therapy.

Further, such packaging cell lines can also be used 
to produce VLPs alone, which can, for example, be used as
adjuvants for administration with other antigens or in
vaccine compositions. Also, co-expression of a selected 
sequence of interest encoding a polypeptide (for example, 
an antigen) in the packaging cell line can also result in 
the entrapment and/or association of the selected 
polypeptide in/with the VLPs.

2.3 DNA Immunization and Gene Delivery

A variety of polypeptide antigens can be used in the 
practice of the present invention. Polypeptide antigens 
can be included in DNA immunization constructs
containing, for example, any of the synthetic expression
cassettes described herein fused in-frame to a coding
sequence for the polypeptide antigen, where expression of
the construct results in VLPs presenting the antigen of
interest. Antigens can be derived from a wide variety of
viruses, bacteria, fungi, plants, protozoans and other 
parasites. For example, the present invention will find 
use for stimulating an immune response against a wide 
variety of proteins from the herpesvirus family, 
including proteins derived from herpes simplex virus 
(HSV) types 1 and 2, such as HSV-1 and HSV-2 gB, gD, gH, 
VP16 and VP22; antigens derived from varicella zoster 
virus (VZV), Epstein-Barr virus (EBV) and cytomegalovirus
(CMV) including CMV gB and gH; and antigens derived from
other human herpesviruses such as HHV6 and HHV7. (See,
e.g. Chee et al., Cytomegaloviruses (J.K. McDougall,
ed., Springer-Verlag 1990) pp. 125-169, for a review of
the protein coding content of cytomegalovirus; McGeoch et
al., J. Gen. Virol. (1988) 69:1531-1574, for a discussion
of the various HSV-1 encoded proteins; U.S. Patent No.
5,171,568 for a discussion of HSV-1 and HSV-2 gB and gD
proteins and the genes encoding therefor; Baer et al.,
Nature (1984) 310:207-211, for the identification of
protein coding sequences in an EBV genome; and Davison
and Scott, J. Gen. Virol. (1986) 67:1759-1816, for a
review of VZV.)

Additionally, immune responses to antigens from the
hepatitis family of viruses, including hepatitis A virus
(HAV), hepatitis B virus (HBV), hepatitis C virus (HCV),
the delta hepatitis virus (HDV), hepatitis E virus (HEV),
and hepatitis G virus, can also be stimulated using the
constructs of the present invention. By way of example,
the HCV genome encodes several viral proteins, including
E1 (also known as E) and E2 (also known as E2/NS1), which
will find use with the present invention (see, Houghton
et al., Hepatology (1991) 14:381-388, for a discussion of
HCV proteins, including E1 and E2). The δ-antigen from
HDV can also be used (see, e.g., U.S. Patent No.
5,389,528, for a description of the δ-antigen).

Similarly, influenza virus is another example of a 
virus for which the present invention will be 
particularly useful. Specifically, the envelope 
glycoproteins HA and NA of influenza A are of particular
interest for generating an immune response. Numerous HA
subtypes of influenza A have been identified (Kawaoka et
al., Virology (1990) 179:759-767; Webster et al.
"Antigenic variation among type A influenza viruses," p.
127-168. In: P. Palese and D.W. Kingsbury (ed.), Genetics
of influenza viruses. Springer-Verlag, New York).

Other antigens of particular interest to be used in 
the practice of the present invention include antigens 
and polypeptides derived therefrom from human
papillomavirus (HPV), such as one or more of the various
early proteins including E6 and E7; tick-borne
encephalitis viruses; and HIV-1 (also known as HTLV-III,
LAV, ARV, etc.), including, but not limited to, antigens
such as gp120, gp41, gp160, Gag and pol from a variety of
isolates including, but not limited to, HIV IIIb, HIV SF2,
HIV-1 SF162, HIV-1 SF170, HIV LAV, HIV LAI, HIV MN,
HIV-1 CM235, HIV-1 US4, other HIV-1 strains from diverse
subtypes (e.g., subtypes A through G, and O), HIV-2
strains and diverse subtypes (e.g., HIV-2 UC1 and HIV-2
UC2). See, e.g., Myers et al., Los Alamos Database, Los
Alamos National Laboratory, Los Alamos, New Mexico; Myers
et al., Human Retroviruses and AIDS, 1990, Los Alamos,
New Mexico: Los Alamos National Laboratory.

Proteins derived from other viruses will also find 
use in the claimed methods, such as without limitation,
proteins from members of the families Picornaviridae
(e.g., polioviruses, etc.); Caliciviridae; Togaviridae
(e.g., rubella virus, dengue virus, etc.); Flaviviridae;
Coronaviridae; Reoviridae; Birnaviridae; Rhabdoviridae
(e.g., rabies virus, etc.); Filoviridae; Paramyxoviridae
(e.g., mumps virus, measles virus, respiratory syncytial
virus, etc.); Orthomyxoviridae (e.g., influenza virus
types A, B and C, etc.); Bunyaviridae; Arenaviridae;
Retroviridae, e.g., HTLV-I; HTLV-II; HIV-1; HIV-2; simian
immunodeficiency virus (SIV) among others. See, e.g.,
Virology, 3rd Edition (W.K. Joklik ed. 1988); Fundamental
Virology, 2nd Edition (B.N. Fields and D.M. Knipe, eds.
1991); Virology, 3rd Edition (Fields, BN, DM Knipe, PM
Howley, Editors, 1996, Lippincott-Raven, Philadelphia,
PA) for a description of these and other viruses.

Particularly preferred bacterial antigens are 
derived from organisms that cause diphtheria, tetanus,
pertussis, meningitis, and other pathogenic states
including, without limitation, antigens derived from
Corynebacterium diphtheriae, Clostridium tetani,
Bordetella pertussis, Neisseria meningitidis, including
serotypes Meningococcus A, B, C, Y and W135 (MenA, B, C,
Y and W135), Haemophilus influenzae type B (Hib), and
Helicobacter pylori. Examples of parasitic antigens
include those derived from organisms causing malaria, 
tuberculosis, and Lyme disease. 

Furthermore, the methods described herein provide
means for treating a variety of malignant cancers. For
example, the system of the present invention can be used
to enhance both humoral and cell-mediated immune
responses to particular proteins specific to a cancer in
question, such as an activated oncogene, a fetal antigen,
or an activation marker. Such tumor antigens include any
of the various MAGEs (melanoma associated antigen E),
including MAGE 1, 2, 3, 4, etc. (Boon, T. Scientific
American (March 1993):82-89); any of the various
tyrosinases; MART 1 (melanoma antigen recognized by T
cells); mutant ras; mutant p53; p97 melanoma antigen; CEA
(carcinoembryonic antigen), among others.

DNA immunization using synthetic expression 
cassettes of the present invention has been demonstrated 
to be efficacious (Examples 8 and 10-12). Animals were 
immunized with both the synthetic expression cassette and 
the wild type expression cassette. The results of the 
immunizations with plasmid-DNAs showed that the synthetic 
expression cassettes provide a clear improvement of 
immunogenicity relative to the native expression 
cassettes. Also, the second boost immunization induced a 
secondary immune response, for example after two to eight 
weeks. Further, the results of CTL assays showed 
increased potency of synthetic expression cassettes for 
induction of cytotoxic T-lymphocyte (CTL) responses by
DNA immunization.

It is readily apparent that the subject invention 
can be used to mount an immune response to a wide variety 
of antigens and hence to treat or prevent a large number 
of diseases. 

2.3.1 DELIVERY OF THE SYNTHETIC EXPRESSION CASSETTES OF THE
PRESENT INVENTION

Polynucleotide sequences coding for the above-described
molecules can be obtained using recombinant methods,
such as by screening cDNA and genomic libraries
from cells expressing the gene, or by deriving the gene
from a vector known to include the same. The sequences
can be analyzed by conventional sequencing techniques.
Furthermore, the desired gene can be isolated directly
from cells and tissues containing the same, using
standard techniques, such as phenol extraction and PCR of
cDNA or genomic DNA. See, e.g., Sambrook et al., supra,
for a description of techniques used to obtain, isolate
and sequence DNA. Once the sequence is known, the gene
of interest can also be produced synthetically, rather
than cloned. The nucleotide sequence can be designed
with the appropriate codons for the particular amino acid
sequence desired. In general, one will select preferred
codons for the intended host in which the sequence will
be expressed. The complete sequence is assembled from
overlapping oligonucleotides prepared by standard
methods and joined into a complete coding
sequence. See, e.g., Edge, Nature (1981) 292:756;
Nambiar et al., Science (1984) 223:1299; Jay et al., J.
Biol. Chem. (1984) 259:6311; Stemmer, W.P.C., (1995) Gene
164:49-53.
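
The two design steps described above (selecting one preferred codon per
amino acid, then dividing the designed sequence into overlapping
oligonucleotides) can be sketched as follows. This is a minimal
illustration only: the codon table is a small hypothetical placeholder and
not a validated codon-usage table for any host, the function names are
invented for this example, and the tiling is done on a single strand for
simplicity, whereas practical gene synthesis typically alternates sense
and antisense oligonucleotides.

```python
# Hypothetical preferred-codon table (assumption: illustrative only, it is
# not taken from the specification and covers only a few residues).
PREFERRED_CODON = {
    "M": "ATG", "G": "GGC", "A": "GCC", "R": "AGG", "S": "AGC",
    "V": "GTG", "L": "CTG", "K": "AAG", "E": "GAG",
}


def back_translate(peptide):
    """Return a coding sequence that uses one listed codon per residue."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in peptide)
    except KeyError as missing:
        raise ValueError(f"no codon listed for residue {missing}") from None


def tile_oligos(sequence, length, overlap):
    """Split a sequence into fragments whose neighbours share `overlap` bases."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and shorter than length")
    step = length - overlap
    oligos = []
    start = 0
    while True:
        oligos.append(sequence[start:start + length])
        if start + length >= len(sequence):
            break
        start += step
    return oligos


def assemble(oligos, overlap):
    """Rejoin tiled fragments by dropping the bases shared with the previous one."""
    return oligos[0] + "".join(oligo[overlap:] for oligo in oligos[1:])
```

For example, `back_translate("MGARASVLS")` yields a 27-nucleotide coding
sequence, and `assemble(tile_oligos(cds, 12, 4), 4)` returns that same
sequence, which is a simple consistency check on the tiling.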

Next, the gene sequence encoding the desired antigen
can be inserted into a vector containing a synthetic
expression cassette of the present invention (e.g., see
Example 1 for construction of various exemplary synthetic
expression cassettes). The antigen is inserted into the
synthetic coding sequence such that when the combined
sequence is expressed it results in the production of
VLPs comprising the polypeptide and/or the antigen of
interest. Insertions can be made within the Gag coding
sequence or at either end of the coding sequence (5',
amino terminus of the expressed polypeptide; or 3',
carboxy terminus of the expressed polypeptide; e.g.,
see Example 1) (Wagner, R., et al., Arch Virol. 127:117-
137, 1992; Wagner, R., et al., Virology 200:162-175,
1994; Wu, X., et al., J. Virol. 69(6):3389-3398, 1995;
Wang, C-T., et al., Virology 200:524-534, 1994; Chazal,
N., et al., Virology 68(1):111-122, 1994; Griffiths,
J.C., et al., J. Virol. 67(6):3191-3198, 1993; Reicin,
A.S., et al., J. Virol. 69(2):642-650, 1995).

Up to 50% of the coding sequences of p55Gag can be
deleted without affecting the assembly to virus-like
particles and expression efficiency (Borsetti, A., et al.,
J. Virol. 72(11):9313-9317, 1998; Garnier, L., et al., J
Virol 72(6):4667-4677, 1998; Zhang, Y., et al., J Virol
72(3):1782-1789, 1998; Wang, C., et al., J Virol 72(10):
7950-7959, 1998). In one embodiment of the present
invention, immunogenicity of the high level expressing
synthetic p55GagMod and p55GagProtMod expression
cassettes can be increased by the insertion of different
structural or non-structural HIV antigens, multiepitope
cassettes, or cytokine sequences into deleted, mutated or
truncated regions of p55GagMod sequence. In another
embodiment of the present invention, immunogenicity of
the high level expressing synthetic Env expression
cassettes can be increased by the insertion of different
structural or non-structural HIV antigens, multiepitope
cassettes, or cytokine sequences into deleted regions of
gp120Mod, gp140Mod or gp160Mod sequences. Such deletions
may be generated following the teachings of the present
invention and information available to one of ordinary
skill in the art. One possible advantage of this
approach, relative to using full-length modified Env
sequences fused to heterologous polypeptides, can be
higher expression/secretion efficiency and/or higher
immunogenicity of the expression product. Such deletions
may be generated following the teachings of the present
invention and information available to one of ordinary
skill in the art. One possible advantage of this
approach, relative to using full-length Env, Gag or Tat
sequences fused to heterologous polypeptides, can be
higher expression/secretion efficiency and/or
immunogenicity of the expression product.

When sequences are added to the amino terminal end
of Gag (for example, when using the synthetic p55GagMod
expression cassette of the present invention), the
polynucleotide can contain coding sequences at the 5' end
that encode a signal for addition of a myristic moiety to
the Gag-containing polypeptide (e.g., sequences that
encode Met-Gly).

The ability of Gag-containing polypeptide constructs 
to form VLPs can be empirically determined following the 
teachings of the present specification. 

HIV polypeptide/antigen synthetic expression 
cassettes include control elements operably linked to the 
coding sequence, which allow for the expression of the 
gene in vivo in the subject species. For example, 
typical promoters for mammalian cell expression include 
the SV40 early promoter, a CMV promoter such as the CMV 
immediate early promoter, the mouse mammary tumor virus 
LTR promoter, the adenovirus major late promoter (Ad
MLP), and the herpes simplex virus promoter, among
others. Other nonviral promoters, such as a promoter
derived from the murine metallothionein gene, will also
find use for mammalian expression. Typically,
transcription termination and polyadenylation sequences
will also be present, located 3' to the translation stop
codon. Preferably, a sequence for optimization of
initiation of translation, located 5' to the coding
sequence, is also present. Examples of transcription
terminator/polyadenylation signals include those derived
from SV40, as described in Sambrook et al., supra, as
well as a bovine growth hormone terminator sequence.

Enhancer elements may also be used herein to 
increase expression levels of the mammalian constructs. 
Examples include the SV40 early gene enhancer, as 
described in Dijkema et al., EMBO J. (1985) 4:761, the
enhancer/promoter derived from the long terminal repeat
(LTR) of the Rous Sarcoma Virus, as described in Gorman
et al., Proc. Natl. Acad. Sci. USA (1982b) 79:6777 and
elements derived from human CMV, as described in Boshart
et al., Cell (1985) 41:521, such as elements included in
the CMV intron A sequence.

Furthermore, plasmids can be constructed which
include chimeric antigen-coding gene sequences,
encoding, e.g., multiple antigens/epitopes of interest,
for example derived from a single or from more than one
viral isolate.

Typically the antigen coding sequences precede or
follow the synthetic coding sequences and the chimeric
transcription unit will have a single open reading frame
encoding both the antigen of interest and the synthetic
Gag coding sequences. Alternatively, multi-cistronic
cassettes (e.g., bi-cistronic cassettes) can be
constructed allowing expression of multiple antigens from
a single mRNA using the EMCV IRES, or the like. Lastly,
antigens can be encoded on separate transcripts from
independent promoters on a single plasmid or other
vector.
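
The single-open-reading-frame requirement for a chimeric transcription
unit can be checked computationally before a construct is made. The
following is a minimal sketch under stated assumptions: the function and
variable names are hypothetical, the input sequences are taken to be
complete coding sequences on the sense strand, and only the three
standard stop codons are considered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq):
    """Split a nucleotide string into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(upstream, downstream):
    """Join two coding sequences so that they form one open reading frame.

    The terminal stop codon of the upstream sequence, if present, is
    removed so that translation continues into the downstream sequence.
    """
    for name, seq in (("upstream", upstream), ("downstream", downstream)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} sequence length is not a multiple of 3")
    up = split_codons(upstream.upper())
    down = split_codons(downstream.upper())
    if up and up[-1] in STOP_CODONS:
        up = up[:-1]
    fused = up + down
    # Any stop codon before the final position would truncate the product.
    if any(codon in STOP_CODONS for codon in fused[:-1]):
        raise ValueError("internal stop codon interrupts the reading frame")
    return "".join(fused)
```

The same function covers both orientations mentioned above: the antigen
coding sequence can be passed as `upstream` (antigen precedes Gag) or as
`downstream` (antigen follows Gag).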

Once complete, the constructs are used for nucleic 
acid immunization or the like using standard gene 
delivery protocols. Methods for gene delivery are known
in the art. See, e.g., U.S. Patent Nos. 5,399,346,
5,580,859, 5,589,466. Genes can be delivered either
directly to the vertebrate subject or, alternatively,
delivered ex vivo, to cells derived from the subject and
the cells reimplanted in the subject.

A number of viral based systems have been developed 
for gene transfer into mammalian cells. For example, 
retroviruses provide a convenient platform for gene 
delivery systems. Selected sequences can be inserted 
into a vector and packaged in retroviral particles using 
techniques known in the art. The recombinant virus can 
then be isolated and delivered to cells of the subject 
either in vivo or ex vivo. A number of retroviral 
systems have been described (U.S. Patent No. 5,219,740; 
Miller and Rosman, BioTechniques (1989) 7:980-990;
Miller, A.D., Human Gene Therapy (1990) 1:5-14; Scarpa et
al., Virology (1991) 180:849-852; Burns et al., Proc.
Natl. Acad. Sci. USA (1993) 90:8033-8037; and Boris-
Lawrie and Temin, Cur. Opin. Genet. Develop. (1993)
3:102-109).

A number of adenovirus vectors have also been 
described. Unlike retroviruses which integrate into the 
host genome, adenoviruses persist extrachromosomally thus 
minimizing the risks associated with insertional 
mutagenesis (Haj-Ahmad and Graham, J. Virol. (1986)
57:267-274; Bett et al., J. Virol. (1993) 67:5911-5921;
Mittereder et al., Human Gene Therapy (1994) 5:717-729;
Seth et al., J. Virol. (1994) 68:933-940; Barr et al.,
Gene Therapy (1994) 1:51-58; Berkner, K.L. BioTechniques
(1988) 6:616-629; and Rich et al., Human Gene Therapy
(1993) 4:461-476).

Additionally, various adeno-associated virus (AAV) 
vector systems have been developed for gene delivery. 
AAV vectors can be readily constructed using techniques
well known in the art. See, e.g., U.S. Patent Nos.
5,173,414 and 5,139,941; International Publication Nos.
WO 92/01070 (published 23 January 1992) and WO 93/03769
(published 4 March 1993); Lebkowski et al., Molec. Cell.
Biol. (1988) 8:3988-3996; Vincent et al., Vaccines 90
(1990) (Cold Spring Harbor Laboratory Press); Carter,
B.J. Current Opinion in Biotechnology (1992) 3:533-539;
Muzyczka, N. Current Topics in Microbiol. and Immunol.
(1992) 158:97-129; Kotin, R.M. Human Gene Therapy (1994)
5:793-801; Shelling and Smith, Gene Therapy (1994) 1:165-
169; and Zhou et al., J. Exp. Med. (1994) 179:1867-1875.

Another vector system useful for delivering the 
polynucleotides of the present invention is the 
enterically administered recombinant poxvirus vaccines
described by Small, Jr., P. A., et al . (U.S. Patent No. 
5,676,950, issued October 14, 1997). 

Additional viral vectors which will find use for 
delivering the nucleic acid molecules encoding the 
antigens of interest include those derived from the pox
family of viruses, including vaccinia virus and avian
poxvirus. By way of example, vaccinia virus recombinants
expressing the genes can be constructed as follows. The
DNA encoding the particular synthetic Gag/antigen coding
sequence is first inserted into an appropriate vector so
that it is adjacent to a vaccinia promoter and flanking
vaccinia DNA sequences, such as the sequence encoding
thymidine kinase (TK). This vector is then used to
transfect cells which are simultaneously infected with
vaccinia. Homologous recombination serves to insert the
vaccinia promoter plus the gene encoding the coding
sequences of interest into the viral genome. The
resulting TK- recombinant can be selected by culturing the
cells in the presence of 5-bromodeoxyuridine and picking
viral plaques resistant thereto. 

Alternatively, avipoxviruses, such as the fowlpox 
and canarypox viruses, can also be used to deliver the 
genes. Recombinant avipox viruses, expressing immunogens
from mammalian pathogens, are known to confer protective
immunity when administered to non-avian species. The use
of an avipox vector is particularly desirable in human
and other mammalian species since members of the avipox
genus can only productively replicate in susceptible
avian species and therefore are not infective in
mammalian cells. Methods for producing recombinant
avipoxviruses are known in the art and employ genetic
recombination, as described above with respect to the
production of vaccinia viruses. See, e.g., WO 91/12882;
WO 89/03429; and WO 92/03545. 

Molecular conjugate vectors, such as the adenovirus 
chimeric vectors described in Michael et al., J. Biol.
Chem. (1993) 268:6866-6869 and Wagner et al., Proc. Natl.
Acad. Sci. USA (1992) 89:6099-6103, can also be used for
gene delivery. 

Members of the Alphavirus genus, such as, but not 
limited to, vectors derived from the Sindbis, Semliki 
Forest, and Venezuelan Equine Encephalitis viruses, will 
also find use as viral vectors for delivering the 
polynucleotides of the present invention (for example, a 
synthetic Gag- or Env-polypeptide encoding expression
cassette as described in Example 14 below). For a
description of Sindbis-virus derived vectors useful for
the practice of the instant methods, see Dubensky et
al., J. Virol. (1996) 70:508-519; and International
Publication Nos. WO 95/07995 and WO 96/17072; as well as
Dubensky, Jr., T.W., et al., U.S. Patent No. 5,843,723,
issued December 1, 1998, and Dubensky, Jr., T.W., U.S.
Patent No. 5,789,245, issued August 4, 1998.

A vaccinia based infection/transfection system can
be conveniently used to provide for inducible, transient
expression of the coding sequences of interest (for
example, a synthetic Gag/HCV-core expression cassette) in
a host cell. In this system, cells are first infected in
vitro with a vaccinia virus recombinant that encodes the
bacteriophage T7 RNA polymerase. This polymerase
displays exquisite specificity in that it only
transcribes templates bearing T7 promoters. Following
infection, cells are transfected with the polynucleotide
of interest, driven by a T7 promoter. The polymerase
expressed in the cytoplasm from the vaccinia virus
recombinant transcribes the transfected DNA into RNA
which is then translated into protein by the host
translational machinery. The method provides for high
level, transient, cytoplasmic production of large
quantities of RNA and its translation products. See,
e.g., Elroy-Stein and Moss, Proc. Natl. Acad. Sci. USA
(1990) 87:6743-6747; Fuerst et al., Proc. Natl. Acad.
Sci. USA (1986) 83:8122-8126.

As an alternative approach to infection with 
vaccinia or avipox virus recombinants, or to the delivery 
of genes using other viral vectors, an amplification
system can be used that will lead to high level
expression following introduction into host cells.
Specifically, a T7 RNA polymerase promoter preceding the
coding region for T7 RNA polymerase can be engineered.
Translation of RNA derived from this template will
generate T7 RNA polymerase which in turn will transcribe
more template. Concomitantly, there will be a cDNA whose
expression is under the control of the T7 promoter.
Thus, some of the T7 RNA polymerase generated from
translation of the amplification template RNA will lead
to transcription of the desired gene. Because some T7
RNA polymerase is required to initiate the amplification,
T7 RNA polymerase can be introduced into cells along with
the template(s) to prime the transcription reaction. The
polymerase can be introduced as a protein or on a plasmid
encoding the RNA polymerase. For a further discussion of
T7 systems and their use for transforming cells, see,
e.g., International Publication No. WO 94/26911; Studier
and Moffatt, J. Mol. Biol. (1986) 189:113-130; Deng and
Wolff, Gene (1994) 143:245-249; Gao et al., Biochem.
Biophys. Res. Commun. (1994) 200:1201-1206; Gao and
Huang, Nuc. Acids Res. (1993) 21:2867-2872; Chen et al.,
Nuc. Acids Res. (1994) 22:2114-2120; and U.S. Patent No.
5,135,855.
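
The reason a priming amount of T7 RNA polymerase is needed can be
illustrated with a toy calculation. The sketch below is not a kinetic
model of any real system: the function name and the parameters `gain`
and `target_rate` are hypothetical, and all quantities are in arbitrary
units. It assumes only that, in each round, the polymerase present
transcribes both the amplification template (yielding more polymerase)
and the gene of interest.

```python
def simulate_amplification(initial_polymerase, rounds, gain=1.0, target_rate=1.0):
    """Toy model of a self-amplifying polymerase template (arbitrary units).

    Each round, the polymerase already present produces `target_rate`
    units of target RNA per unit of polymerase and `gain` units of new
    polymerase per unit of polymerase.
    """
    polymerase = float(initial_polymerase)
    target_rna = 0.0
    for _ in range(rounds):
        target_rna += target_rate * polymerase
        polymerase += gain * polymerase
    return polymerase, target_rna
```

With no initial polymerase the loop never starts, and both returned
values stay at zero regardless of the number of rounds; with any nonzero
priming amount, polymerase and target RNA both grow from round to round.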

The synthetic expression cassette of interest can 
also be delivered without a viral vector. For example, 
the synthetic expression cassette can be packaged as DNA 
or RNA in liposomes prior to delivery to the subject or
to cells derived therefrom. Lipid encapsulation is
generally accomplished using liposomes which are able to
stably bind or entrap and retain nucleic acid. The ratio
of condensed DNA to lipid preparation can vary but will
generally be around 1:1 (mg DNA:micromoles lipid), or
more of lipid. For a review of the use of liposomes as
carriers for delivery of nucleic acids, see Hug and
Sleight, Biochim. Biophys. Acta. (1991) 1097:1-17;
Straubinger et al., in Methods in Enzymology (1983),
Vol. 101, pp. 512-527.
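
The stated ratio of about 1:1 (mg DNA:micromoles lipid) translates
directly into a lipid amount for a given quantity of DNA. The following
sketch shows the arithmetic; the function names and the idea of
dispensing from a lipid stock of known millimolar concentration are
assumptions made for illustration and are not part of the described
method.

```python
def lipid_for_dna(dna_mg, lipid_umol_per_mg_dna=1.0):
    """Micromoles of lipid for `dna_mg` of DNA at the given ratio.

    The default of 1.0 corresponds to the 1:1 (mg DNA:micromoles lipid)
    ratio; larger values correspond to "more of lipid".
    """
    if dna_mg < 0 or lipid_umol_per_mg_dna <= 0:
        raise ValueError("amounts must be non-negative and the ratio positive")
    return dna_mg * lipid_umol_per_mg_dna


def stock_volume_ul(lipid_umol, stock_mM):
    """Microliters of a lipid stock needed to supply `lipid_umol`.

    1 mM equals 1 micromole per milliliter, so micromoles divided by mM
    gives milliliters; multiplying by 1000 converts to microliters.
    """
    if stock_mM <= 0:
        raise ValueError("stock concentration must be positive")
    return lipid_umol / stock_mM * 1000.0
```

For instance, 0.05 mg of DNA at a 1:1 ratio calls for 0.05 micromoles of
lipid, which is 5 microliters of a hypothetical 10 mM stock.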

Liposomal preparations for use in the present
invention include cationic (positively charged), anionic
(negatively charged) and neutral preparations, with
cationic liposomes particularly preferred. Cationic
liposomes have been shown to mediate intracellular
delivery of plasmid DNA (Felgner et al., Proc. Natl.
Acad. Sci. USA (1987) 84:7413-7416); mRNA (Malone et al.,
Proc. Natl. Acad. Sci. USA (1989) 86:6077-6081); and
purified transcription factors (Debs et al., J. Biol.
Chem. (1990) 265:10189-10192), in functional form.

Cationic liposomes are readily available. For 
example, N[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium
(DOTMA) liposomes are available under the
trademark Lipofectin, from GIBCO BRL, Grand Island, NY.
(See, also, Felgner et al., Proc. Natl. Acad. Sci. USA
(1987) 84:7413-7416). Other commercially available
lipids include (DDAB/DOPE) and DOTAP/DOPE (Boehringer).
Other cationic liposomes can be prepared from readily
available materials using techniques well known in the
art. See, e.g., Szoka et al., Proc. Natl. Acad. Sci. USA
(1978) 75:4194-4198; PCT Publication No. WO 90/11092 for
a description of the synthesis of DOTAP
(1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.

Similarly, anionic and neutral liposomes are readily 
available, such as from Avanti Polar Lipids (Birmingham,
AL), or can be easily prepared using readily available
materials. Such materials include phosphatidyl choline,
cholesterol, phosphatidyl ethanolamine,
dioleoylphosphatidyl choline (DOPC), dioleoylphosphatidyl
glycerol (DOPG), dioleoylphosphatidyl ethanolamine (DOPE),
among others. These materials can also be mixed with the
DOTMA and DOTAP starting materials in appropriate ratios.
Methods for making liposomes using these materials are
well known in the art.

The liposomes can comprise multilamellar vesicles
(MLVs), small unilamellar vesicles (SUVs), or large
unilamellar vesicles (LUVs). The various liposome-nucleic
acid complexes are prepared using methods known
in the art. See, e.g., Straubinger et al., in METHODS OF
IMMUNOLOGY (1983), Vol. 101, pp. 512-527; Szoka et al.,
Proc. Natl. Acad. Sci. USA (1978) 75:4194-4198;
Papahadjopoulos et al., Biochim. Biophys. Acta (1975)
394:483; Wilson et al., Cell (1979) 17:77; Deamer and
Bangham, Biochim. Biophys. Acta (1976) 443:629; Ostro et
al., Biochem. Biophys. Res. Commun. (1977) 76:836; Fraley
et al., Proc. Natl. Acad. Sci. USA (1979) 76:3348; Enoch
and Strittmatter, Proc. Natl. Acad. Sci. USA (1979)
76:145; Fraley et al., J. Biol. Chem. (1980) 255:10431;
Szoka and Papahadjopoulos, Proc. Natl. Acad. Sci. USA
(1978) 75:145; and Schaefer-Ridder et al., Science (1982)
215:166.

The DNA and/or protein antigen (s) can also be 
delivered in cochleate lipid compositions similar to 
those described by Papahadjopoulos et al., Biochim.
Biophys. Acta (1975) 394:483-491. See, also, U.S.
Patent Nos. 4,663,161 and 4,871,488.

The synthetic expression cassette of interest (e.g., 
any of the synthetic expression cassettes described in 
Example 1) may also be encapsulated, adsorbed to, or 
associated with, particulate carriers. Such carriers 
present multiple copies of a selected antigen to the 
immune system and promote migration, trapping and 
retention of antigens in local lymph nodes. The 
particles can be taken up by professional
antigen-presenting cells such as macrophages and dendritic cells,
and/or can enhance antigen presentation through other
mechanisms such as stimulation of cytokine release.
Examples of particulate carriers include those derived
from polymethyl methacrylate polymers, as well as
microparticles derived from poly(lactides) and
poly(lactide-co-glycolides), known as PLG. See, e.g.,
Jeffery et al., Pharm. Res. (1993) 10:362-368; McGee JP,
et al., J Microencapsul. 14(2):197-210, 1997; O'Hagan DT,
et al., Vaccine 11(2):149-54, 1993.

Furthermore, other particulate systems and polymers 
can be used for the in vivo or ex vivo delivery of the 
gene of interest. For example, polymers such as
polylysine, polyarginine, polyornithine, spermine,
spermidine, as well as conjugates of these molecules, are
useful for transferring a nucleic acid of interest.
Similarly, DEAE dextran-mediated transfection, calcium
phosphate precipitation or precipitation using other
insoluble inorganic salts, such as strontium phosphate,
aluminum silicates including bentonite and kaolin,
chromic oxide, magnesium silicate, talc, and the like,
will find use with the present methods. See, e.g.,
Felgner, P.L., Advanced Drug Delivery Reviews (1990)
5:163-187, for a review of delivery systems useful for
gene transfer. Peptoids (Zuckerman, R.N., et al., U.S.
Patent No. 5,831,005, issued November 3, 1998) may also
be used for delivery of a construct of the present
invention.

Additionally, biolistic delivery systems employing 
particulate carriers such as gold and tungsten are
especially useful for delivering synthetic expression
cassettes of the present invention. The particles are
coated with the synthetic expression cassette(s) to be
delivered and accelerated to high velocity, generally
under a reduced atmosphere, using a gunpowder discharge
from a "gene gun." For a description of such techniques,
and apparatuses useful therefor, see, e.g., U.S. Patent
Nos. 4,945,050; 5,036,006; 5,100,792; 5,179,022;
5,371,015; and 5,478,744. Also, needle-less injection
systems can be used (Davis, H.L., et al., Vaccine 12:1503-
1509, 1994; Bioject, Inc., Portland, OR).

Recombinant vectors carrying a synthetic expression
cassette of the present invention are formulated into
compositions for delivery to the vertebrate subject.
These compositions may either be prophylactic (to prevent
infection) or therapeutic (to treat disease after
infection). The compositions will comprise a
"therapeutically effective amount" of the gene of
interest such that an amount of the antigen can be
produced in vivo so that an immune response is generated
in the individual to which it is administered. The exact
amount necessary will vary depending on the subject being
treated; the age and general condition of the subject to
be treated; the capacity of the subject's immune system
to synthesize antibodies; the degree of protection
desired; the severity of the condition being treated; the
particular antigen selected and its mode of
administration, among other factors. An appropriate
effective amount can be readily determined by one of
skill in the art. Thus, a "therapeutically effective
amount" will fall in a relatively broad range that can be
determined through routine trials.

The compositions will generally include one or more 
"pharmaceutically acceptable excipients or vehicles" such 
as water, saline, glycerol, polyethyleneglycol,
hyaluronic acid, ethanol, etc. Additionally, auxiliary
substances, such as wetting or emulsifying agents, pH
buffering substances, surfactants and the like, may be
present in such vehicles. Certain facilitators of
immunogenicity or of nucleic acid uptake and/or
expression can also be included in the compositions or
coadministered, such as, but not limited to, bupivacaine,
cardiotoxin and sucrose.

Once formulated, the compositions of the invention 
can be administered directly to the subject (e.g., as 
described above) or, alternatively, delivered ex vivo, to
cells derived from the subject, using methods such as
those described above. For example, methods for the ex
vivo delivery and reimplantation of transformed cells
into a subject are known in the art and can include,
e.g., dextran-mediated transfection, calcium phosphate
precipitation, polybrene mediated transfection,
lipofectamine and LT-1 mediated transfection, protoplast
fusion, electroporation, encapsulation of the
polynucleotide(s) (with or without the corresponding
antigen) in liposomes, and direct microinjection of the
DNA into nuclei.

Direct delivery of synthetic expression cassette 
compositions in vivo will generally be accomplished with 
or without viral vectors, as described above, by
injection using either a conventional syringe, needleless
devices such as Bioject® or a gene gun, such as the
Accell® gene delivery system (PowderJect Technologies,
Inc., Oxford, England). The constructs can be delivered
(e.g., injected) either subcutaneously, epidermally,
intradermally, intramuscularly, intravenously,
intramucosally (such as nasally, rectally and vaginally),
intraperitoneally or orally. Delivery of DNA into cells
of the epidermis is particularly preferred as this mode
of administration provides access to skin-associated
lymphoid cells and provides for a transient presence of
DNA in the recipient. Other modes of administration
include oral ingestion and pulmonary administration,
suppositories, needle-less injection, transcutaneous and
transdermal applications. Dosage treatment may be a
single dose schedule or a multiple dose schedule.

2.3.2 EX VIVO DELIVERY OF THE SYNTHETIC EXPRESSION
CASSETTES OF THE PRESENT INVENTION

In one embodiment, T cells, and related cell types 
(including but not limited to antigen presenting cells, 
such as macrophages, monocytes, lymphoid cells, dendritic
cells, B-cells, T-cells, stem cells, and progenitor cells
thereof), can be used for ex vivo delivery of the
synthetic expression cassettes of the present invention.
T cells can be isolated from peripheral blood lymphocytes
(PBLs) by a variety of procedures known to those skilled
in the art. For example, T cell populations can be
"enriched" from a population of PBLs through the removal
of accessory and B cells. In particular, T cell
enrichment can be accomplished by the elimination of non-T
cells using anti-MHC class II monoclonal antibodies.
Similarly, other antibodies can be used to deplete
specific populations of non-T cells. For example, anti-Ig
antibody molecules can be used to deplete B cells and
anti-MacI antibody molecules can be used to deplete
macrophages.

T cells can be further fractionated into a number of 
different subpopulations by techniques known to those 
skilled in the art. Two major subpopulations can be 
isolated based on their differential expression of the 
cell surface markers CD4 and CD8. For example, following
the enrichment of T cells as described above, CD4+ cells
can be enriched using antibodies specific for CD4 (see
Coligan et al., supra). The antibodies may be coupled to
a solid support such as magnetic beads. Conversely, CD8+
cells can be enriched through the use of antibodies
specific for CD4 (to remove CD4+ cells), or can be
isolated by the use of CD8 antibodies coupled to a solid
support. CD4 lymphocytes from HIV-1 infected patients can
be expanded ex vivo, before or after transduction as
described by Wilson et al. (1995) J. Infect. Dis. 172:88.

Following purification of T cells, a variety of
methods of genetic modification known to those skilled in
the art can be performed using non-viral or viral-based
gene transfer vectors constructed as described herein.
For example, one such approach involves transduction of
the purified T cell population with vector-containing
supernatant of cultures derived from vector producing
cells. A second approach involves co-cultivation of an
irradiated monolayer of vector-producing cells with the
purified T cells. A third approach involves a similar
co-cultivation approach; however, the purified T cells
are pre-stimulated with various cytokines and cultured 48
hours prior to the co-cultivation with the irradiated
vector producing cells. Pre-stimulation prior to such
transduction increases effective gene transfer (Nolta et
al. (1992) Exp. Hematol. 20:1065). Stimulation of these
cultures to proliferate also provides increased cell
populations for re-infusion into the patient. Subsequent
to co-cultivation, T cells are collected from the vector
producing cell monolayer, expanded, and frozen in liquid
nitrogen.

Gene transfer vectors, containing one or more
synthetic expression cassettes of the present invention
(associated with appropriate control elements for
delivery to the isolated T cells) can be assembled using
known methods.

Selectable markers can also be used in the 
construction of gene transfer vectors. For example, a 
marker can be used which imparts to a mammalian cell 
transduced with the gene transfer vector resistance to a 
cytotoxic agent. The cytotoxic agent can be, but is not
limited to, neomycin, aminoglycoside, tetracycline,
chloramphenicol, sulfonamide, actinomycin, netropsin,
distamycin A, anthracycline, or pyrazinamide. For
example, neomycin phosphotransferase II imparts
resistance to the neomycin analogue geneticin (G418).

The T cells can also be maintained in a medium 
containing at least one type of growth factor prior to 
being selected. A variety of growth factors are known in 
the art which sustain the growth of a particular cell 
type. Examples of such growth factors are cytokine
mitogens such as rIL-2, IL-10, IL-12, and IL-15, which
promote growth and activation of lymphocytes. Certain
types of cells are stimulated by other growth factors
such as hormones, including human chorionic gonadotropin
(hCG) and human growth hormone. The selection of an
appropriate growth factor for a particular cell
population is readily accomplished by one of skill in the
art.

For example, white blood cells such as
differentiated progenitor and stem cells are stimulated
by a variety of growth factors. More particularly, IL-3,
IL-4, IL-5, IL-6, IL-9, GM-CSF, M-CSF, and G-CSF,
produced by activated TH cells and activated macrophages,
stimulate myeloid stem cells, which then differentiate
into pluripotent stem cells, granulocyte-monocyte
progenitors, eosinophil progenitors, basophil
progenitors, megakaryocytes, and erythroid progenitors.
Differentiation is modulated by growth factors such as
GM-CSF, IL-3, IL-6, IL-11, and EPO.

Pluripotent stem cells then differentiate into
lymphoid stem cells, bone marrow stromal cells, T cell
progenitors, B cell progenitors, thymocytes, TH cells, TC
cells, and B cells. This differentiation is modulated by
growth factors such as IL-3, IL-4, IL-6, IL-7, GM-CSF,
M-CSF, G-CSF, IL-2, and IL-5.

Granulocyte-monocyte progenitors differentiate to
monocytes, macrophages, and neutrophils. Such
differentiation is modulated by the growth factors
GM-CSF, M-CSF, and IL-8. Eosinophil progenitors
differentiate into eosinophils. This process is
modulated by GM-CSF and IL-5.

The differentiation of basophil progenitors into
mast cells and basophils is modulated by GM-CSF, IL-4,
and IL-9. Megakaryocytes produce platelets in response
to GM-CSF, EPO, and IL-6. Erythroid progenitor cells
differentiate into red blood cells in response to EPO.

Thus, during activation by the CD3-binding agent, T
cells can also be contacted with a mitogen, for example a
cytokine such as IL-2. In particularly preferred
embodiments, the IL-2 is added to the population of T
cells at a concentration of about 50 to 100 µg/ml.
Activation with the CD3-binding agent can be carried out
for 2 to 4 days.

Once suitably activated, the T cells are genetically
modified by contacting the same with a suitable gene
transfer vector under conditions that allow for
transfection of the vectors into the T cells. Genetic
modification is carried out when the cell density of the
T cell population is between about 0.1 x 10^6 and 5 x 10^6,
preferably between about 0.5 x 10^6 and 2 x 10^6. A number
of suitable viral and nonviral-based gene transfer
vectors have been described for use herein.

After transduction, transduced cells are selected
away from non-transduced cells using known techniques.
For example, if the gene transfer vector used in the
transduction includes a selectable marker which confers
resistance to a cytotoxic agent, the cells can be
contacted with the appropriate cytotoxic agent, whereby
non-transduced cells can be negatively selected away from
the transduced cells. If the selectable marker is a cell
surface marker, the cells can be contacted with a binding
agent specific for the particular cell surface marker,
whereby the transduced cells can be positively selected
away from the population. The selection step can also
entail fluorescence-activated cell sorting (FACS)
techniques, such as where FACS is used to select cells
from the population containing a particular surface
marker, or the selection step can entail the use of
magnetically responsive particles as retrievable supports
for target cell capture and/or background removal.

More particularly, positive selection of the
transduced cells can be performed using a FACS cell
sorter (e.g., a FACSVantage™ Cell Sorter, Becton Dickinson
Immunocytometry Systems, San Jose, CA) to sort and
collect transduced cells expressing a selectable cell
surface marker. Following transduction, the cells are
stained with fluorescent-labeled antibody molecules
directed against the particular cell surface marker. The
amount of bound antibody on each cell can be measured by
passing droplets containing the cells through the cell
sorter. By imparting an electromagnetic charge to
droplets containing the stained cells, the transduced
cells can be separated from other cells. The positively
selected cells are then harvested in sterile collection
vessels. These cell sorting procedures are described in
detail, for example, in the FACSVantage™ Training Manual,
with particular reference to sections 3-11 to 3-28 and
10-1 to 10-17.

Positive selection of the transduced cells can also
be performed using magnetic separation of cells based on
expression of a particular cell surface marker. In such
separation techniques, cells to be positively selected
are first contacted with a specific binding agent (e.g., an
antibody or reagent that interacts specifically with the
cell surface marker). The cells are then contacted with
retrievable particles (e.g., magnetically responsive
particles) which are coupled with a reagent that binds
the specific binding agent (that has bound to the
positive cells). The cell-binding agent-particle complex
can then be physically separated from non-labeled cells,
for example using a magnetic field. When using
magnetically responsive particles, the labeled cells can
be retained in a container using a magnetic field while
the negative cells are removed. These and similar
separation procedures are known to those of ordinary
skill in the art.

Expression of the vector in the selected transduced
cells can be assessed by a number of assays known to
those skilled in the art. For example, Western blot or
Northern analysis can be employed depending on the nature
of the inserted nucleotide sequence of interest. Once
expression has been established and the transformed T
cells have been tested for the presence of the selected
synthetic expression cassette, they are ready for
infusion into a patient via the peripheral blood stream.

The invention includes a kit for genetic
modification of an ex vivo population of primary
mammalian cells. The kit typically contains a gene
transfer vector coding for at least one selectable marker
and at least one synthetic expression cassette contained
in one or more containers, ancillary reagents or
hardware, and instructions for use of the kit.

Experimental

Below are examples of specific embodiments for
carrying out the present invention. The examples are
offered for illustrative purposes only, and are not
intended to limit the scope of the present invention in
any way.

Efforts have been made to ensure accuracy with
respect to numbers used (e.g., amounts, temperatures,
etc.), but some experimental error and deviation should,
of course, be allowed for.

The Gag (SEQ ID NO:1), Gag-protease (SEQ ID NO:2),
Gag-polymerase (SEQ ID NO:3), and Gag-reverse
transcriptase (SEQ ID NO:77) coding sequences were
selected from the HIV-1SF2 strain (Sanchez-Pescador, R.,
et al., Science 227(4686):484-492, 1985; Luciw, P.A., et
al., U.S. Patent No. 5,156,949, issued October 20, 1992;
Luciw, P.A., et al., U.S. Patent No. 5,688,688, issued
November 18, 1997). These sequences were manipulated to
maximize expression of their gene products.

First, the HIV-1 codon usage pattern was modified so
that the resulting nucleic acid coding sequence was
comparable to codon usage found in highly expressed human
genes. The HIV codon usage reflects a high content of the
nucleotides A or T in the codon triplet. The effect of
the HIV-1 codon usage is a high AT content in the DNA
sequence that results in a high AU content in the RNA and
in a decreased translation ability and instability of the
mRNA. In comparison, highly expressed human codons
prefer the nucleotides G or C. The Gag-encoding
sequences were modified to be comparable to codon usage
found in highly expressed human genes.
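
The recoding step described above can be sketched in a few
lines of Python. This is only an illustration under stated
assumptions: the rule used here (for each amino acid, pick the
synonymous codon with the most G or C bases) is a crude
stand-in for a measured table of codon usage in highly
expressed human genes, which is what an actual design would
use, and the toy input sequence is arbitrary and is not taken
from any SEQ ID NO. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


# ASSUMPTION: "preferred" codon = most G/C-rich synonymous codon.
# This is a simplification for illustration, not a human codon usage table.
PREFERRED = {}
for codon, aa in CODON_TO_AA.items():
    best = PREFERRED.get(aa)
    if best is None or (gc_count(codon), codon) > (gc_count(best), best):
        PREFERRED[aa] = codon


def translate(dna):
    """Translate a DNA coding sequence; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, usable, 3))


def recode(dna):
    """Rewrite every codon with the preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(dna))


def at_fraction(dna):
    """Fraction of A or T bases in a DNA sequence."""
    dna = dna.upper()
    return sum(base in "AT" for base in dna) / len(dna)


if __name__ == "__main__":
    toy = "ATGGGTGCTAGAGCTTCAGTATTAAGTGGAGGA"  # arbitrary toy sequence
    new = recode(toy)
    print(translate(toy) == translate(new))   # encoded protein is unchanged
    print(at_fraction(toy), at_fraction(new))  # A-T content before and after
```

The point of the sketch is the invariant: the encoded amino
acid sequence is unchanged while the A-T content of the coding
sequence drops.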

Figure 11 presents a comparison of the percent A-T
content for the cDNAs of stable versus unstable RNAs
(comparison window size = 50). Human IFNγ mRNA is known
to (i) be unstable, (ii) have a short half-life, and
(iii) have a high A-U content. Human GAPDH
(glyceraldehyde-3-phosphate dehydrogenase) mRNA is known
to (i) be a stable RNA, and (ii) have a low A-U content.
In Figure 11, the percent A-T content of these two
sequences is compared to the percent A-T content of
native HIV-1SF2 Gag cDNA and to the synthetic Gag cDNA
sequence of the present invention. The top two panels of
the figure show the percent A-T content over the length
of the sequences for IFNγ and native Gag. The bottom two
panels of the figure show the percent A-T content over
the length of the sequences for GAPDH and the synthetic
Gag. Experiments performed in support of the present
invention showed that the synthetic Gag sequences were
capable of a higher level of protein production (see the
Examples) than the native Gag sequences. The data in
Figure 11 suggest that one reason for this increased
production may be increased stability of the mRNA
corresponding to the synthetic Gag coding sequences
versus the mRNA corresponding to the native Gag coding
sequences.
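
A windowed percent A-T profile of the kind plotted in Figure 11
can be computed as sketched below. The window size of 50 is
taken from the text; the choice of a sliding window advanced
one base at a time is an assumption, since the text does not
say how the windows were placed, and the function name is
hypothetical.

```python
def windowed_at_percent(seq, window=50):
    """Percent A/T (or A/U) in each sliding window of a sequence.

    ASSUMPTION: windows overlap and advance by one base; the source only
    states the window size.
    """
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_at = [base in "ATU" for base in seq]
    count = sum(is_at[:window])
    profile = [100.0 * count / window]
    for i in range(window, len(seq)):
        # Slide the window: add the entering base, drop the leaving base.
        count += is_at[i] - is_at[i - window]
        profile.append(100.0 * count / window)
    return profile


if __name__ == "__main__":
    # Small demonstration with a window of 4 on a toy sequence.
    print(windowed_at_percent("AAAAGGGG", window=4))
```

Plotting the returned list against window position for a native
and a recoded sequence gives a comparison of the same kind as
the figure panels.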

Second, there are inhibitory (or instability)
elements (INS) located within the coding sequences of the
Gag and Gag-protease coding sequences (Schneider R., et
al., J. Virol. 71(7):4892-4903, 1997). RRE is a secondary
RNA structure that interacts with the HIV-encoded Rev
protein to overcome the expression down-regulating
effects of the INS. To overcome the requirement for
post-transcriptional activating mechanisms of RRE and
Rev, and to enhance independent expression of the Gag
polypeptide, the INS were inactivated by introducing
multiple point mutations that did not alter the reading
frame of the encoded proteins. Figure 1 shows the
original SF2 Gag sequence, the location of the INS
sequences, and the modifications made to the INS
sequences to reduce their effects.

For the Gag-protease sequence (wild type, SEQ ID
NO:2; synthetic, SEQ ID NOs:5, 78 and 79), the changes in
codon usage were restricted to the regions up to the -1
frameshift and starting again at the end of the Gag
reading frame (Figure 2; the region indicated in lower
case letters in Figure 2 is the unmodified region).
Further, inhibitory (or instability) elements (INS)
located within the coding sequences of the Gag-protease
polypeptide coding sequence were altered as well
(indicated in Figure 2). The synthetic coding sequences
were assembled by the Midland Certified Reagent Company
(Midland, Texas).

Modification of the Gag-polymerase sequences (wild
type, SEQ ID NO:3; synthetic, SEQ ID NO:6) and Gag-reverse
transcriptase sequences (SEQ ID NOs:80 through 84)
includes similar modifications as described for
Gag-protease in order to preserve the frameshift region.
Locations of the inactivation sites and changes to the
sequence to alter the inactivation sites are presented in
Figure 12 for the native HIV-1SF2 Gag-polymerase sequence.

In one embodiment of the invention, the full length
polymerase coding region of the Gag-polymerase sequence
was included with the synthetic Gag sequences in order to
increase the number of epitopes for virus-like particles
expressed by the synthetic, optimized Gag expression
cassette. Because synthetic HIV-1 Gag-polymerase
expresses the potentially deleterious functional enzymes
reverse transcriptase (RT) and integrase (INT) (in
addition to the structural proteins and protease), it is
important to inactivate RT and INT functions. Several
in-frame deletions in the RT and INT reading frame can be
made to achieve catalytically nonfunctional enzymes with
respect to their RT and INT activity. {Jay A. Levy
(Editor) (1995) The Retroviridae, Plenum Press, New York.
ISBN 0-306-45033X. Pages 215-20; Grimison, B. and
Laurence, J. (1995), Journal Of Acquired Immune
Deficiency Syndromes and Human Retrovirology 9(1):58-68;
Wakefield, J. K., et al., (1992) Journal Of Virology
66(11):6806-6812; Esnouf, R., et al., (1995) Nature
Structural Biology 2(4):303-308; Maignan, S., et al.,
(1998) Journal Of Molecular Biology 282(2):359-368; Katz,
R. A. and Skalka, A. M. (1994) Annual Review Of
Biochemistry 13 (1994); Jacobo-Molina, A., et al., (1993)
Proceedings Of the National Academy Of Sciences Of the
United States Of America 90(13):6320-6324; Hickman, A.
B., et al., (1994) Journal Of Biological Chemistry
269(46):29279-29287; Goldgur, Y., et al., (1998)
Proceedings Of the National Academy Of Sciences Of the
United States Of America 95(16):9150-9154; Goette, M., et
al., (1998) Journal Of Biological Chemistry
273(17):10139-10146; Gorton, J. L., et al., (1998)
Journal of Virology 72(6):5046-5055; Engelman, A., et
al., (1997) Journal Of Virology 71(5):3507-3514; Dyda,
F., et al., Science 266(5193):1981-1986; Davies, J. F.,
et al., (1991) Science 252(5002):88-95; Bujacz, G., et
al., (1996) Febs Letters 398(2-3):175-178; Beard, W. A.,
et al., (1996) Journal Of Biological Chemistry
271(21):12213-12220; Kohlstaedt, L. A., et al., (1992)
Science 256(5065):1783-1790; Krug, M. S. and Berger, S.
L. (1991) Biochemistry 30(44):10614-10623; Mazumder, A.,
et al., (1996) Molecular Pharmacology 49(4):621-628;
Palaniappan, C., et al., (1997) Journal Of Biological
Chemistry 272(17):11157-11164; Rodgers, D. W., et al.,
(1995) Proceedings Of the National Academy Of Sciences Of
the United States Of America 92(4):1222-1226; Sheng, N.
and Dennis, D. (1993) Biochemistry 32(18):4938-4942;
Spence, R. A., et al., (1995) Science 267(5200):988-993.}

Furthermore, selected B- and/or T-cell epitopes can
be added to the Gag-polymerase constructs within the
deletions of the RT- and INT-coding sequence to replace
and augment any epitopes deleted by the functional
modifications of RT and INT. Alternately, selected B-
and T-cell epitopes (including CTL epitopes) from RT and
INT can be included in a minimal VLP formed by expression
of the synthetic Gag or synthetic GagProt cassette,
described above. (For descriptions of known HIV B- and
T-cell epitopes see, HIV Molecular Immunology Database CTL
Search Interface; Los Alamos Sequence Compendia,
1987-1997; Internet address:
http://hiv-web.lanl.gov/immunology/index.html.)

The resulting modified coding sequences are
presented as a synthetic Gag expression cassette (SEQ ID
NO:4), a synthetic Gag-protease expression cassette (SEQ
ID NOs:5, 78 and 79), and a synthetic Gag-polymerase
expression cassette (SEQ ID NO:6). Synthetic expression
cassettes containing codon modifications in the reverse
transcriptase region are shown in SEQ ID NOs:80 through
84. An alignment of selected sequences is presented in
Figure 7. A common region (Gag-common; SEQ ID NO:9)
extends from position 1 to position 1262.

The synthetic DNA fragments for Gag and Gag-protease
were cloned into the following expression vectors:
pCMVKm2, for transient expression assays and DNA
immunization studies (the pCMVKm2 vector was derived from
pCMV6a (Chapman et al., Nuc. Acids Res. (1991)
19:3979-3986) and comprises a kanamycin selectable marker,
a ColE1 origin of replication, a CMV promoter enhancer and
Intron A, followed by an insertion site for the synthetic
sequences described below, followed by a polyadenylation
signal derived from bovine growth hormone; the pCMVKm2
vector differs from the pCMV-link vector only in that a
polylinker site was inserted into pCMVKm2 to generate
pCMV-link (Figure 14, polylinker at positions 1646 to
1697)); pESN2dhfr (Figure 13A) and pCMVPLEdhfr (also known
as pCMVIII, as shown in Figure 13B), for expression in
Chinese Hamster Ovary (CHO) cells; and pAcC13, a shuttle
vector for use in the Baculovirus expression system
(pAcC13 was derived from pAcC12, which was described by
Munemitsu S., et al., Mol. Cell. Biol. 10(11):5977-5982,
1990).

A restriction map for vector pCMV-link is presented
in Figure 14. In the figure, the CMV promoter (CMV IE
ENH/PRO), bovine growth hormone terminator (BGH pA),
kanamycin selectable marker (kan), and a ColE1 origin of
replication (ColE1 ori) are indicated. A polycloning
site is also indicated in the figure following the CMV
promoter sequences.

A restriction map for vector pESN2dhfr is presented
in Figure 13A. In the figure, the CMV promoter (pCMV,
hCMVIE), bovine growth hormone terminator (BGHpA), SV40
origin of replication (SV40ori), neomycin selectable
marker (Neo), SV40 polyA (SV40pA), Adenovirus 2 late
promoter (Ad2VLP), and the murine dhfr gene (mu dhfr) are
indicated. A polycloning site is also indicated in the
figure following the CMV promoter sequences.

Briefly, construction of pCMVPLEdhfr (pCMVIII) was
as follows. To construct a DHFR cassette, the EMCV IRES
(internal ribosome entry site) leader was PCR-amplified
from pCite-4a+ (Novagen, Inc., Milwaukee, WI) and
inserted into pET-23d (Novagen, Inc., Milwaukee, WI) as
an Xba-Nco fragment to give pET-EMCV. The dhfr gene was
PCR-amplified from pESN2dhfr to give a product with a
Gly-Gly-Gly-Ser spacer in place of the translation stop
codon and inserted as an Nco-BamHI fragment to give
pET-E-DHFR. Next, the attenuated neo gene was PCR-amplified
from a pSV2Neo (Clontech, Palo Alto, CA) derivative and
inserted into the unique BamHI site of pET-E-DHFR to give
pET-E-DHFR/Neo(m2). Then, the bovine growth hormone
terminator from pCDNA3 (Invitrogen, Inc., Carlsbad, CA)
was inserted downstream of the neo gene to give
pET-E-DHFR/Neo(m2)BGHt. The EMCV-dhfr/neo selectable marker
cassette fragment was prepared by cleavage of
pET-E-DHFR/Neo(m2)BGHt. The CMV enhancer/promoter plus
Intron A was transferred from pCMV6a (Chapman et al., Nuc.
Acids Res. (1991) 19:3979-3986) as a HindIII-SalI fragment
into pUC19 (New England Biolabs, Inc., Beverly, MA). The
vector backbone of pUC19 was deleted from the NdeI to the
SapI sites. The above described DHFR cassette was added
to the construct such that the EMCV IRES followed the CMV
promoter to produce the final construct. The vector also
contained an amp^r gene and an SV40 origin of replication.

Selected pCMVKm2 vectors containing the synthetic
expression cassettes have been designated as follows:
pCMVKm2.GagMod.SF2, pCMVKm2.GagprotMod.SF2,
pCMVKm2.GagpolMod.SF2, pCMVKm2.GagprotMod.SF2.GP1 (SEQ ID
NO:78) and pCMVKm2.GagprotMod.SF2.GP2 (SEQ ID NO:79).
Other exemplary Gag-encoding expression cassettes are
shown in the Figures and as Sequence Listings.

Modification of HIV-1 Gag/Hepatitis C Core Chimeric
Protein Nucleic Acid Coding Sequences: Generation of
Synthetic Expression Cassettes

To facilitate the ligation of the Gag and HCV core
coding sequences, PCR amplification was employed. The
synthetic p55Gag expression cassette was used as a PCR
template with the following primers: GAG5 (SEQ ID NO:11)
and P55-SAL3 (SEQ ID NO:12). The PCR amplification was
conducted at 55°C for 25 cycles using Stratagene's Pfu
polymerase. The resulting PCR product was rendered free
of nucleotides and primers using the Promega PCR clean-up
kit and then subjected to EcoRI and SalI digestions. For
HCV core coding sequences, the following primers were
used with an HCV template (Houghton, M., et al., U.S.
Patent No. 5,714,596, issued February 3, 1998; Houghton,
M., et al., U.S. Patent No. 5,712,088, issued January 27,
1998; Houghton, M., et al., U.S. Patent No. 5,683,864,
issued November 4, 1997; Weiner, A.J., et al., U.S.
Patent No. 5,728,520, issued March 17, 1998; Weiner,
A.J., et al., U.S. Patent No. 5,766,845, issued June 16,
1998; Weiner, A.J., et al., U.S. Patent No. 5,670,152,
issued September 23, 1997): CORESAL5 (SEQ ID NO:13) and
173CORE (SEQ ID NO:14) using the conditions outlined
above. The purified product was digested with SalI and
BamHI restriction enzymes. The digested Gag and HCV core
PCR products were ligated into the pCMVKm2 vector
digested with EcoRI and BamHI. Ligation of the PCR
products at the SalI site resulted in a direct fusion of
the final amino acid of p55Gag to the second amino acid
of HCV core, serine. Amino acid 173 of core is a serine
and is followed immediately by a TAG termination codon.
The sequence of the fusion clone was confirmed. The
pCMVKm2 vector containing the synthetic expression
cassette was designated as pCMVKm2.GagModHCVcore.

The EcoRI-BamHI fragment of p55Gag-core173 was also
cloned into EcoRI-BamHI-digested pAcC13 for baculovirus
expression. Western blots confirmed expression and
sucrose gradient sedimentation along with electron
microscopy confirmed particle formation. To generate the
above clone (but containing the synthetic Gag sequences
instead of wild-type), the following steps were
performed: pCMVKm2-modified p55Gag was used as template
for PCR amplification with MS65 (SEQ ID NO:15) and
MS66 (SEQ ID NO:16) primers. The region amplified
corresponds to the BspHI and SalI sites at the C-terminus
of the synthetic Gag sequence. The amplification product
was digested with BspHI and SalI and ligated to SalI/BamHI
digested pCMV-link along with the SalI/BspHI fragment from
pCMV-Km-p55modGag, representing the amino terminal end
of modified Gag, and the SalI/BamHI fragment from
pCMV-p55Gag-core173. Thereafter, a T4-blunted-SalI
partial/BamHI fragment was ligated into pAcC4-SmaI/BamHI
to generate pAcC4-p55GagMod-core173 (containing the
synthetic sequence presented as SEQ ID NO:7).
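
Cloning steps of this kind depend on where the restriction
sites fall in each fragment. The sketch below locates the
recognition sequences of the enzymes named above in a DNA
string. The recognition sequences are the standard ones for
these enzymes; the toy insert is hypothetical and is not any
of the constructs or primers of the present invention. Because
all four sites are palindromic, scanning one strand is
sufficient.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
    "BamHI": "GGATCC",
    "BspHI": "TCATGA",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each recognition site."""
    seq = seq.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start + 1)  # report 1-based coordinates
            start = seq.find(motif, start + 1)
        hits[name] = positions
    return hits


if __name__ == "__main__":
    # HYPOTHETICAL toy insert: EcoRI site, short ORF piece, SalI site,
    # second piece, stop codon, BamHI site.
    toy = "GAATTC" + "ATGGGC" + "GTCGAC" + "AGCACC" + "TAG" + "GGATCC"
    print(find_sites(toy))
```

A fragment intended for EcoRI/BamHI cloning with an internal
SalI fusion junction should show exactly one hit for each of
those three enzymes, as the toy insert does.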

C. Defining of the Major Homology Region (MHR) of HIV-1
p55Gag

The Major Homology Region (MHR) of HIV-1 p55 (Gag)
is located in the p24-CA sequence of Gag. It is a
conserved stretch of 20 amino acids (SEQ ID NO:19). The
position in the wild type HIV-1SF2 Gag protein is from aa
286-305 and spans a region from nucleotides 856-915 in
the native HIV-1SF2 Gag DNA sequence. The position in the
synthetic Gag protein is from aa 288-307 and spans a
region from nucleotides 862-921 for the synthetic Gag DNA
sequence. The nucleotide sequence for the MHR in the
synthetic GagMod.SF2 is presented as SEQ ID NO:20.
Mutations or deletions in the amino acid sequence of the
MHR can severely impair particle production (Borsetti, A.,
et al., J. Virol. 72(11):9313-9317, 1998; Mammano, F., et
al., J. Virol. 68(8):4927-4936, 1994).

Percent identity to the MHR nucleotide sequence can 
be determined, for example, using the MacDNAsis program 
(Hitachi Software Engineering America Limited, South San 
Francisco, CA) , Higgins algorithm, with the following 
exemplary parameters: gap penalty = 5, no. of top
diagonals = 5, fixed gap penalty = 5, K-tuple = 2, window
size = 5, and floating gap penalty = 10.
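
Two small calculations underlie the preceding paragraphs:
converting an amino acid range into the nucleotide range that
encodes it, and scoring percent identity between two
sequences. The sketch below illustrates both. It is not the
Higgins algorithm or the MacDNAsis program: it assumes the two
sequences have already been aligned (gaps written as "-") and
simply counts matching columns, and the function names are
hypothetical.

```python
def aa_range_to_nt(first_aa, last_aa):
    """Nucleotide range (1-based, inclusive) encoding an amino acid range,
    counting from the first base of the first codon of the reading frame."""
    return (3 * (first_aa - 1) + 1, 3 * last_aa)


def percent_identity(aligned_a, aligned_b):
    """Percent identical columns between two pre-aligned sequences.

    ASSUMPTION: inputs are already aligned and of equal length; columns
    where both sequences have a gap are skipped, and a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    print(aa_range_to_nt(286, 305))  # MHR in native Gag
    print(aa_range_to_nt(288, 307))  # MHR in synthetic Gag
    print(percent_identity("ACGT", "ACGA"))
```

The coordinate helper reproduces the ranges given in the text:
amino acids 286-305 map to nucleotides 856-915, and amino acids
288-307 map to nucleotides 862-921.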

D. Generation of Synthetic Env Expression Cassettes

Env coding sequences of the present invention
include, but are not limited to, polynucleotide sequences
encoding the following HIV-encoded polypeptides: gp160,
gp140, and gp120 (see, e.g., U.S. Patent No. 5,792,459
for a description of the HIV-1SF2 ("SF2") Env
polypeptide). The relationships between these
polypeptides are shown schematically in Figure 15 (in the
figure: the polypeptides are indicated as lines; the
amino and carboxy termini are indicated on the gp160
line; the open circle represents the oligomerization
domain; the open square represents a transmembrane
spanning domain (TM); and "c" represents the location of
a cleavage site; in gp140.mut the "X" indicates that the
cleavage site has been mutated such that it no longer
functions as a cleavage site). The polypeptide gp160
includes the coding sequences for gp120 and gp41. The
polypeptide gp41 is comprised of several domains
including an oligomerization domain (OD) and a
transmembrane spanning domain (TM). In the native
envelope, the oligomerization domain is required for the
non-covalent association of three gp41 polypeptides to
form a trimeric structure: through non-covalent
interactions with the gp41 trimer (and itself), the gp120
polypeptides are also organized in a trimeric structure.
A cleavage site (or cleavage sites) exists approximately
between the polypeptide sequences for gp120 and the
polypeptide sequences corresponding to gp41. This
cleavage site(s) can be mutated to prevent cleavage at
the site. The resulting gp140 polypeptide corresponds to
a truncated form of gp160 where the transmembrane
spanning domain of gp41 has been deleted. This gp140
polypeptide can exist in both monomeric and oligomeric
(i.e., trimeric) forms by virtue of the presence of the
oligomerization domain in the gp41 moiety. In the
situation where the cleavage site has been mutated to
prevent cleavage and the transmembrane portion of gp41
has been deleted, the resulting polypeptide product is
designated "mutated" gp140 (e.g., gp140.mut). As will be
apparent to those in the field, the cleavage site can be
mutated in a variety of ways. The native amino acid
sequence in the SF162 cleavage sites is: APTKAKRRWQREKR
(SEQ ID NO:21), where KAKRR (SEQ ID NO:22) is termed the
"second" site and REKR (SEQ ID NO:23) is the "first"
site. Exemplary mutations include the following
constructs: gp140.mut7.modSF162, which encodes the amino
acid sequence APTKAISSWQSEKS (SEQ ID NO:24) in the
cleavage site region; gp140.mut8.modSF162, which encodes
the amino acid sequence APTIAISSWQSEKS (SEQ ID NO:25) in
the cleavage site region; and gp140.mut.modSF162, which
encodes the amino acid sequence APTKAKRRWQREKS (SEQ ID
NO:26). Mutations are denoted in bold. The native amino
acid sequence in the US4 cleavage sites is:
APTQAKRRWQREKR (SEQ ID NO:27), where QAKRR (SEQ ID
NO:28) is termed the "second" site and REKR (SEQ ID
NO:23) is the "first" site. Exemplary mutations include
the following construct: gp140.mut.modUS4, which encodes
the amino acid sequence APTQAKRRWQREKS (SEQ ID NO:29) in
the cleavage site region. Mutations are denoted in bold.
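
Because bold type does not survive in a plain-text rendering,
the substituted residues can be recovered by comparing each
mutant cleavage-site sequence with its native counterpart. The
sketch below does this position by position. The sequences are
copied as they appear in the text above; positions are numbered
within the short window shown, not within full-length Env, and
the function name is hypothetical.

```python
def substitutions(native, mutant):
    """List (position, native_residue, mutant_residue) for each difference.

    Positions are 1-based within the window supplied. Both sequences must
    have the same length (substitutions only, no insertions or deletions).
    """
    if len(native) != len(mutant):
        raise ValueError("sequences must have the same length")
    return [
        (i, n, m)
        for i, (n, m) in enumerate(zip(native, mutant), start=1)
        if n != m
    ]


# Sequences as given in the text (cleavage site regions).
NATIVE_SF162 = "APTKAKRRWQREKR"  # SEQ ID NO:21
MUT7_SF162 = "APTKAISSWQSEKS"    # SEQ ID NO:24
MUT8_SF162 = "APTIAISSWQSEKS"    # SEQ ID NO:25
MUT_SF162 = "APTKAKRRWQREKS"     # SEQ ID NO:26
NATIVE_US4 = "APTQAKRRWQREKR"    # SEQ ID NO:27
MUT_US4 = "APTQAKRRWQREKS"       # SEQ ID NO:29

if __name__ == "__main__":
    for label, native, mutant in [
        ("mut7 (SF162)", NATIVE_SF162, MUT7_SF162),
        ("mut8 (SF162)", NATIVE_SF162, MUT8_SF162),
        ("mut (SF162)", NATIVE_SF162, MUT_SF162),
        ("mut (US4)", NATIVE_US4, MUT_US4),
    ]:
        changes = ", ".join(f"{n}{i}{m}" for i, n, m in substitutions(native, mutant))
        print(f"{label}: {changes}")
```

On these inputs, the "mut" constructs differ from native only
at the final residue of the "first" site, while mut7 and mut8
also alter the "second" site.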

A. Modification of HIV-1 Env (Envelope) Nucleic Acid
Coding Sequences

In one embodiment of the present invention, wild-
type Env coding sequences were selected from the HIV-1
("SF162") strain (Cheng-Mayer (1989) PNAS USA 86:8575-
8579). These SF162 sequences were as follows: gp120,
SEQ ID NO:30 (Fig. 16); gp140, SEQ ID NO:31 (Fig. 17);
and gp160, SEQ ID NO:32 (Fig. 18).

In another embodiment of the present invention,
wild-type Env coding sequences were selected from the
HIV-US4 strain (Mascola, et al. (1994) J. Infect. Dis.
169:48-54). These US4 sequences were as follows: gp120,
SEQ ID NO:51 (Fig. 38); gp140, SEQ ID NO:52 (Fig. 39);
and gp160, SEQ ID NO:53 (Fig. 40).

These Env coding sequences were manipulated to 
maximize expression of their gene products. 

First, the wild-type coding region was modified in
one or more of the following ways. In one embodiment,
sequences encoding hypervariable regions of Env,
particularly V1 and/or V2, were deleted. In other
embodiments, mutations were introduced into sequences
encoding the cleavage site in Env to abrogate the
enzymatic cleavage of oligomeric gp140 into gp120
monomers. (See, e.g., Earl et al. (1990) PNAS USA
87:648-652; Earl et al. (1991) J. Virol. 65:31-41). In
yet other embodiments, hypervariable region(s) were
deleted, N-glycosylation sites were removed and/or
cleavage sites mutated.

Second, the HIV-1 codon usage pattern was modified
so that the resulting nucleic acid coding sequence was
comparable to codon usage found in highly expressed human
genes. The HIV codon usage reflects a high content of the
nucleotides A or T in the codon triplet. The effect of
the HIV-1 codon usage is a high AT content in the DNA
sequence that results in a decreased translation ability
and instability of the mRNA. In comparison, highly
expressed human codons prefer the nucleotides G or C.
The Env coding sequences were modified to be comparable
to codon usage found in highly expressed human genes.

Figures 22A-22H present comparisons of the percent
A-T content for the cDNAs of stable versus unstable RNAs
(comparison window size = 50). Human IFNγ mRNA is known
to (i) be unstable, (ii) have a short half-life, and
(iii) have a high A-U content. Human GAPDH
(glyceraldehyde-3-phosphate dehydrogenase) mRNA is known
to (i) be a stable RNA, and (ii) have a low A-U content.
In Figures 22A-H, the percent A-T content of these two
sequences is compared to the percent A-T content of (1)
native HIV-1 US4 Env gp160 cDNA and a synthetic US4 Env
gp160 cDNA sequence (i.e., having modified codons) of the
present invention; and (2) native HIV-1 SF162 Env gp160
cDNA and a synthetic SF162 Env gp160 cDNA sequence (i.e.,
having modified codons) of the present invention.
Figures 22A-H show the percent A-T content over the
length of the sequences for IFNγ (Figures 22C and 22G);
native gp160 Env US4 and SF162 (Figures 22A and 22E,
respectively); GAPDH (Figures 22D and 22H); and the
synthetic gp160 Env for US4 and SF162 (Figures 22B and
22F). Experiments performed in support of the present
invention showed that the synthetic Env sequences were
capable of a higher level of protein production (see the
Examples) than the native Env sequences. The data in
Figures 22A-H suggest that one reason for this increased
production is increased stability of the mRNA
corresponding to the synthetic Env coding sequences
versus the mRNA corresponding to the native Env coding
sequences.

To create the synthetic coding sequences of the
present invention, the gene cassettes were designed to
comprise the entire coding sequence of interest.
Synthetic gene cassettes were constructed by
oligonucleotide synthesis and PCR amplification to
generate gene fragments. Primers were chosen to provide
convenient restriction sites for subcloning. The
resulting fragments were then ligated to create the
entire desired sequence, which was then cloned into an
appropriate vector. The final synthetic sequences were
(i) screened by restriction endonuclease digestion and
analysis, and (ii) subjected to DNA sequencing in order to
confirm that the desired sequence had been obtained; and
(iii) the identity and integrity of the expressed protein
were confirmed by SDS-PAGE and Western blotting (see
Examples). The synthetic coding sequences were assembled
at Chiron Corp. or by the Midland Certified Reagent
Company (Midland, Texas).

Exemplary modified coding sequences are presented as
synthetic Env expression cassettes in Tables 1A and 1B.
The following expression cassettes (i) have unique,
terminal EcoRI and XbaI cloning sites; (ii) include Kozak
sequences to promote optimal translation; (iii) include tPA
signal sequences (to direct the Env polypeptide to the
cell membrane, see, e.g., Chapman et al., infra); (iv)
have open reading frames optimized for expression in
mammalian cells; and (v) include a translational stop
signal codon.

Table 1A: Exemplary Synthetic Env Expression
Cassettes (SF162)

Expression Cassette            Seq Id  Further Information
-----------------------------  ------  ------------------------------
gp120 SF162                    30      wild-type; Figure 16
gp140 SF162                    31      wild-type; Figure 17
gp160 SF162                    32      wild-type; Figure 18
gp120.modSF162                 33      none; Figure 19
gp120.modSF162.delV2           34      deleted V2 loop; Figure 20
gp120.modSF162.delV1/V2        35      deleted V1 and V2; Figure 21
gp140.modSF162                 36      none; Figure 23
gp140.modSF162.delV2           37      deleted V2 loop; Figure 24
gp140.modSF162.delV1/V2        38      deleted V1 and V2; Figure 25
gp140.mut.modSF162             39      mutated cleavage site; Fig. 26
gp140.mut.modSF162.delV2       40      deleted V2; mutated cleavage
                                       site; Figure 27
gp140.mut.modSF162.delV1/V2    41      deleted V1 & V2; mutated
                                       cleavage site; Figure 28
gp140.mut7.modSF162            42      mutated cleavage site; Fig. 29
gp140.mut7.modSF162.delV2      43      mutated cleavage site; deleted
                                       V2; Figure 30
gp140.mut7.modSF162.delV1/V2   44      mutated cleavage site; deleted
                                       V1 and V2; Figure 31
gp140.mut8.modSF162            45      mutated cleavage site; Fig. 32
gp140.mut8.modSF162.delV2      46      mutated cleavage site; deleted
                                       V2; Figure 33
gp140.mut8.modSF162.delV1/V2   47      mutated cleavage site; deleted
                                       V1 and V2; Figure 34
gp160.modSF162                 48      none; Figure 35
gp160.modSF162.delV2           49      deleted V2 loop; Figure 36
gp160.modSF162.delV1/V2        50      deleted V1 & V2; Figure 37

Table 1B: Exemplary Synthetic Env Expression
Cassettes (US4)

Expression Cassette            Seq Id  Further Information
-----------------------------  ------  ------------------------------
gp120 US4                      51      wild-type; Figure 38
gp140 US4                      52      wild-type; Figure 39
gp160 US4                      53      wild-type; Figure 40
gp120.modUS4                   54      none; Figure 41
gp120.modUS4.del 128-194       55      deletion in V1 and V2 regions;
                                       Figure 42
gp140.modUS4                   56      none; Figure 43
gp140.mut.modUS4               57      mutated cleavage site; Figure
                                       44
gp140TM.modUS4                 58      native transmembrane region;
                                       Figure 45
gp140.modUS4.delV1/V2          59      deleted V1 and V2; Figure 46
gp140.modUS4.delV2             60      deleted V1; Figure 47
gp140.mut.modUS4.delV1/V2      61      mutated cleavage site; deleted
                                       V1 and V2; Figure 48
gp140.modUS4.del 128-194       62      deletion in V1 and V2 regions;
                                       Figure 49
gp140.mut.modUS4.del 128-194   63      mutated cleavage site; deletion
                                       in V1 and V2 regions; Figure 50
gp160.modUS4                   64
gp160.modUS4.delV1             65      deleted V1; Figure 52
gp160.modUS4.delV2             66      deleted V2; Figure 53
gp160.modUS4.delV1/V2          67      deleted V1 and V2; Figure 54
gp160.modUS4.del 128-194       68      deletion in V1 and V2 regions;
                                       Figure 55


Alignments of the sequences presented in the above 
tables are presented in Figures 66A and 66B. 
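
The dotted cassette designations used in the tables follow a regular pattern (Env form, optional cleavage-site mutant tag, "mod" plus strain, optional deletion). The sketch below shows one way such a designation might be split into fields for bookkeeping. The function and field names are hypothetical and are not part of the specification; the sketch handles only the dotted synthetic designations, not the wild-type entries.

```python
import re


def parse_cassette_name(name):
    """Split a designation such as 'gp140.mut7.modSF162.delV1/V2'."""
    fields = {
        "env_form": None,          # gp120, gp140 or gp160
        "cleavage_mutant": None,   # mut, mut7, mut8 (mutated cleavage site)
        "codon_modified": False,   # True when a 'mod' token is present
        "strain": None,            # e.g. SF162 or US4
        "deletion": None,          # e.g. V2, V1/V2 or 128-194
    }
    # OCR'd or hand-typed names may contain stray spaces; drop them first.
    for token in name.replace(" ", "").split("."):
        if re.fullmatch(r"gp(120|140|160)", token):
            fields["env_form"] = token
        elif re.fullmatch(r"mut\d*", token):
            fields["cleavage_mutant"] = token
        elif token.startswith("mod"):
            fields["codon_modified"] = True
            fields["strain"] = token[len("mod"):]
        elif token.startswith("del"):
            fields["deletion"] = token[len("del"):]
    return fields


if __name__ == "__main__":
    print(parse_cassette_name("gp140.mut7.modSF162.delV1/V2"))
    print(parse_cassette_name("gp120.modUS4.del 128-194"))
```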

A common region (Env-common) extends from nucleotide
position 1186 to nucleotide position 1329 (SEQ ID NO:69,
Fig. 56) relative to the wild-type US4 sequence and from
nucleotide position 1117 to position 1260 (SEQ ID NO:70,
Fig. 57) relative to the wild-type SF162 sequence. The
synthetic sequences of the present invention
corresponding to these regions are presented as SEQ ID
NO:71 (Figure 58) for the synthetic Env US4 common region
and as SEQ ID NO:72 (Figure 59) for the synthetic Env
SF162 common region.

Percent identity to this sequence can be determined,
for example, using the Smith-Waterman search algorithm
(Time Logic, Incline Village, NV), with the following
exemplary parameters: weight matrix = nuc4x4hb; gap
opening penalty = 20; gap extension penalty = 5;
reporting threshold = 1; alignment threshold = 20.

Various forms of the different embodiments of the
present invention (e.g., constructs) may be combined.
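
As an illustration of the percent-identity determination mentioned above, the following is a minimal Smith-Waterman sketch with affine gap penalties. It is not the Time Logic implementation and does not use the nuc4x4hb matrix; the match and mismatch scores (+5 and -4) are assumptions chosen for illustration, and a gap of length k is assumed to cost gap_open + (k - 1) * gap_extend. Only the gap opening (20) and extension (5) values are taken from the text.

```python
def local_percent_identity(a, b, match=5, mismatch=-4,
                           gap_open=20, gap_extend=5):
    """Return (best local score, percent identity of that alignment)."""
    n, m = len(a), len(b)
    neg = float("-inf")
    H = [[0] * (m + 1) for _ in range(n + 1)]    # best score ending at (i, j)
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # alignment ends in a gap in a
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # alignment ends in a gap in b
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, E[i][j], F[i][j])
            if H[i][j] > best:
                best, bi, bj = H[i][j], i, j

    # Trace back from the best cell, counting aligned columns and matches.
    i, j, state = bi, bj, "H"
    matches = columns = 0
    while True:
        if state == "H":
            if H[i][j] == 0:
                break
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            if H[i][j] == diag:
                matches += a[i - 1] == b[j - 1]
                columns += 1
                i, j = i - 1, j - 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":
            columns += 1
            if E[i][j] == H[i][j - 1] - gap_open:
                state = "H"   # this column opened the gap
            j -= 1
        else:
            columns += 1
            if F[i][j] == H[i - 1][j] - gap_open:
                state = "H"
            i -= 1
    identity = 100.0 * matches / columns if columns else 0.0
    return best, identity


if __name__ == "__main__":
    print(local_percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Percent identity here is the number of identical columns divided by the total number of columns in the best local alignment; other definitions (for example, relative to the length of the reference sequence) would give different values.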

Cloning Synthetic Env Expression Cassettes of the
Present Invention.

The synthetic DNA fragments encoding the Env
polypeptides were typically cloned into the eucaryotic
expression vectors described above for Gag, for example,
pCMVKm2/pCMVlink (Figure 4), pCMV6a, pESN2dhfr (Figure
13A), and pCMVIII (Figure 13B; alternately designated as
the pCMV-PL-E-dhfr/neo vector).

Exemplary designations for pCMVlink vectors
containing synthetic expression cassettes of the present
invention are as follows: pCMVlink.gp140.modSF162;
pCMVlink.gp140.modSF162.delV2;
pCMVlink.gp140.mut.modSF162;
pCMVlink.gp140.mut.modSF162.delV2; pCMVKm2.gp140.modUS4;
pCMVKm2.gp140.modUS4.delV2; pCMVKm2.gp140.mut.modUS4;
and pCMVKm2.gp140.mut.modUS4.delV1/V2.
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G. Generation of Synthetic Tat Expression Cassettes 

Tat coding sequences have also been modified
according to the teachings of the present specification.
The wild-type nucleotide sequence encoding Tat from
variant SF162 is presented in Figure 76 (SEQ ID NO:85).
The corresponding wild-type amino acid sequence is
presented in Figure 77 (SEQ ID NO:86). Figure 81 (SEQ ID
NO:89) shows the nucleotide sequence encoding the amino
terminus of the Tat protein; the codon encoding
cysteine-22 is underlined. Other exemplary constructs
encoding synthetic Tat polypeptides are shown in Figures
78 and 79 (SEQ ID NOs:87 and 88). In one embodiment (SEQ
ID NO:88), the cysteine residue at position 22 is replaced
by a glycine. Caputo et al. (1996) Gene Therapy 3:235
have shown that this mutation affects the transactivation
domain of Tat.

Various forms of the different embodiments of the 
invention, described herein, may be combined. 

H. Deposit of Vectors

Selected exemplary constructs shown below and
described herein are deposited at Chiron Corporation,
Emeryville, CA, 94662-8097, and were sent to the American
Type Culture Collection, 10801 University Boulevard,
Manassas, VA 20110-2209 on December 27, 1999.

Plasmid Name                       Chiron      Date Sent
                                   Deposit #   to ATCC
---------------------------------  ---------   ---------
pCMVgp160.modUS4                   5094        27 Dec 99
pCMVgp160del1.modUS4               5095        27 Dec 99
pCMVgp160del2.modUS4               5096        27 Dec 99
pCMVgp160del1-2.modUS4             5097        27 Dec 99
pCMVgp160del128-194.mod.US4        5098        27 Dec 99
pCMVgp140mut.modUS4del128-194      5100        27 Dec 99
pCMVgp140.mut.mod.US4              5101        27 Dec 99
pCMVgp160.modSF162                 5125        27 Dec 99
pCMVgp160.modSF162.delV2           5126        27 Dec 99
pCMVgp160.modSF162.delV1V2         5127        27 Dec 99
pCMVgp140.mut.modSF162delV2        5128        27 Dec 99
pCMVgp140.mut7.modSF162            5129        27 Dec 99
pCMVgp140.mut7.modSF162delV2       5130        27 Dec 99
pCMVgp140.mut8.modSF162            5131        27 Dec 99
pCMVgp140.mut8.modSF162delV2       5132        27 Dec 99
pCMVgp140.mut8.modSF162delV1V2     5133        27 Dec 99
pCMVKm2.Gagprot.Mod.SF2.GP1        5150        27 Dec 99
pCMVKm2.Gagprot.Mod.SF2.GP2        5151        27 Dec 99



Example 2
Expression Assays for the
Synthetic Gag, Env and Tat Coding Sequences

The HIV-1SF2 wild-type Gag (SEQ ID NO:1) and Gag-
protease (SEQ ID NO:2) sequences were cloned into
expression vectors having the same features as the
vectors into which the synthetic Gag (SEQ ID NO:4) and
Gag-protease (SEQ ID NOs:5, 78 or 79) sequences were
cloned.
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Expression efficiencies for various vectors carrying 
the HIV-1SF2 wild-type and synthetic Gag sequences were
evaluated as follows. Cells from several mammalian cell
lines (293, RD, COS-7, and CHO; all obtained from the
American Type Culture Collection, 10801 University
Boulevard, Manassas, VA 20110-2209) were transfected with
2 μg of DNA in transfection reagent LT1 (PanVera
Corporation, 545 Science Dr., Madison, WI). The cells
were incubated for 5 hours in reduced serum medium (Opti-
MEM, Gibco-BRL, Gaithersburg, MD). The medium was then
replaced with normal medium as follows: 293 cells, IMDM,
10% fetal calf serum, 2% glutamine (BioWhittaker,
Walkersville, MD); RD and COS-7 cells, D-MEM, 10% fetal
calf serum, 2% glutamine (Opti-MEM, Gibco-BRL,
Gaithersburg, MD); and CHO cells, Ham's F-12, 10% fetal
calf serum, 2% glutamine (Opti-MEM, Gibco-BRL,
Gaithersburg, MD). The cells were incubated for either
48 or 60 hours. Supernatants were harvested and filtered
through 0.45 μm syringe filters and, optionally, stored
at -20°C.

Supernatants were evaluated using the Coulter p24
assay (Coulter Corporation, Hialeah, FL, US), using 96-
well plates coated with a murine monoclonal antibody
directed against HIV core antigen. The HIV-1 p24 antigen
binds to the coated wells. Biotinylated antibodies
against HIV recognize the bound p24 antigen. Conjugated
streptavidin-horseradish peroxidase reacts with the
biotin. Color develops from the reaction of peroxidase
with TMB substrate. The reaction is terminated by
addition of 4N H2SO4. The intensity of the color is
directly proportional to the amount of HIV p24 antigen in
a sample.

The results of these expression assays are presented
in Tables 2A and 2B. Tables 2A and 2B show data
obtained using the synthetic Gag-protease expression
cassette of SEQ ID NO:5. Similar results were obtained
using the Gag-protease expression cassettes of SEQ ID
NOs:78 and 79.

Table 2: In vitro gag and gagprot p24 expression

TABLE 2a. Increased in vitro expression from modified vs. native gag
plasmids in supernatants and lysates from transiently transfected cells

native (nat)a    supernatant (sup)  cell line  hours post     total ng p24
modified (mod)b  lysate (lys)                  transfection   (fold increase)
---------------  -----------------  ---------  ------------   ---------------
nat              sup                293        48             3.4
mod              sup                293        48             1260 (371)

nat              sup                293        60             3.2
mod              sup                293        60             2222 (694)

nat              sup                293        60             1.8
mod              sup                293        60             1740 (966)

nat              sup                293        60             1.8
mod              sup                293        60             580 (322)

nat              lys                293        60
mod              lys                293        60             85 (51)

nat              sup                RD         48             5.6
mod              sup                RD         48             66 (12)

nat              sup                RD         60             7.8
mod              sup                RD         60             70.2 (9)

nat              lys                RD         60             1.9
mod              lys                RD         60             7.8 (4)

nat              sup                COS-7      48             0.4
mod              sup                COS-7      48             33.4 (84)

nat              sup                COS-7      48             0.4
mod              sup                COS-7      48             10 (25)

nat              lys                COS-7      48             3
mod              lys                COS-7      48             14 (5)

a pCMVLink.Gag.SF2.PRE

b pCMVKm2.GagMod.SF2

TABLE 2b. In vitro expression from modified gag and gagprotease
plasmids in supernatants and lysates from transiently transfected
cells

plasmid          supernatant (sup)  cell line  hours post     total ng p24 d
                 lysate (lys)                  transfection
---------------  -----------------  ---------  ------------   --------------
Gag a            sup                293        60             760
GagProt(GP1) b   sup                293        60             380
GagProt(GP2) c   sup                293        60             320
Gag              lys                293        60             78
GagProt(GP1)     lys                293        60             1250
GagProt(GP2)     lys                293        60             400
Gag              sup                COS-7      72             40
GagProt(GP1)     sup                COS-7      72             150
GagProt(GP2)     sup                COS-7      72             290
Gag              lys                COS-7      72             60
GagProt(GP1)     lys                COS-7      72             63
GagProt(GP2)     lys                COS-7      72             58

a pCMVKm2.GagMod.SF2

b pCMVKm2.GagProtMod.SF2(GP1) gagprotease with codon optimization
and inactivation of INS in protease

c pCMVKm2.GagProtMod.SF2(GP2) gagprotease with only inactivation
of INS in protease

d Shown are representative results from 3 independent experiments for
each cell line tested.

The data showed that the synthetic Gag and Gag-
protease expression cassettes provided dramatic increases
in production of their protein products, relative to the
native (HIV-1SF2 wild-type) sequences, when expressed in
a variety of cell lines.
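
The fold-increase values reported in parentheses in Table 2a are the ratio of p24 obtained with the modified plasmid to p24 obtained with the native plasmid. The short sketch below recomputes three of them from the tabulated totals; the function name is hypothetical, and small differences from the tabulated figures can arise because the table entries are themselves rounded.

```python
def fold_increase(native_ng, modified_ng):
    """Return modified / native p24 yield (both in total ng)."""
    if native_ng <= 0:
        raise ValueError("native yield must be positive")
    return modified_ng / native_ng


if __name__ == "__main__":
    # (native ng p24, modified ng p24) pairs taken from Table 2a.
    pairs = [(3.2, 2222), (1.8, 580), (0.4, 10)]
    for native, modified in pairs:
        ratio = fold_increase(native, modified)
        print(f"{native} ng -> {modified} ng: about {ratio:.0f}-fold")
```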

B. Env Coding Sequences

The HIV-SF162 ("SF162") wild-type Env (SEQ ID NO:1-
3) and HIV-US4 ("US4") wild-type Env (SEQ ID NO:22-24)
sequences were cloned into expression vectors having the
same features as the vectors into which the synthetic Env
sequences were cloned.

Expression efficiencies for various vectors carrying
the SF162 and US4 wild-type and synthetic Env sequences
were evaluated essentially as described above for Gag,
except that cell lysates were prepared in 40 μl lysis
buffer (1.0% NP40, 0.1 M Tris pH 7.5) and frozen at
-20°C, and capture ELISAs were performed as follows.

For Capture ELISAs, 250 ng of an ammonium sulfate
IgG cut of goat polyclonal antibody to gp120SF2/env2-3
was used to coat each well of a 96-well plate (Corning,
Corning, NY). Serial dilutions of gp120/SF2 protein (MID
167) were used to set the quantitation curve from which
expression of US4 or SF162 gp120 proteins from
transfection supernatants and lysates was calculated.
Samples were screened undiluted and, optionally, by
serial 2-fold dilutions. A human polyclonal antibody to
HIV-1 gp120/SF2 was used to detect bound gp120 envelope
protein, followed by horseradish peroxidase (HRP)-
labeled goat anti-human IgG conjugates. TMB (Pierce,
Rockford, IL) was used as the substrate and the reaction
was terminated by addition of 4N H2SO4. The
reaction was quantified by measuring the optical density
(OD) at 450 nm. The intensity of the color is directly
proportional to the amount of HIV gp120 antigen in a
sample. Purified SF2 gp120 protein was used
as a standard.
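
Quantitation against a standard curve, as described above, can be sketched as follows. This is a simplified illustration, not the analysis actually used: it assumes piecewise-linear interpolation between standards (real ELISA curves are often fitted with a sigmoidal model), and the function names and the example OD values are hypothetical.

```python
def two_fold_series(top_concentration, steps):
    """Concentrations of a serial 2-fold dilution, highest first."""
    return [top_concentration / 2 ** k for k in range(steps)]


def interpolate_concentration(od, standards):
    """Linearly interpolate a concentration from (concentration, OD) standards."""
    points = sorted(standards, key=lambda p: p[1])   # order by OD
    for (c0, od0), (c1, od1) in zip(points, points[1:]):
        if od0 <= od <= od1:
            if od1 == od0:
                return c0
            return c0 + (c1 - c0) * (od - od0) / (od1 - od0)
    raise ValueError("OD lies outside the range of the standard curve")


def sample_concentration(od, standards, dilution_factor=1):
    """Correct the interpolated value for the dilution of the sample."""
    return interpolate_concentration(od, standards) * dilution_factor


if __name__ == "__main__":
    # Hypothetical standards: (ng/ml, OD at 450 nm).
    standards = [(0.0, 0.0), (50.0, 0.5), (100.0, 1.0)]
    print(two_fold_series(100, 4))
    print(sample_concentration(0.75, standards, dilution_factor=2))
```

Samples whose OD exceeds the highest standard are rejected rather than extrapolated, which mirrors the practice of re-reading such samples at a higher 2-fold dilution.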

The results of the transient expression assays are
presented in Tables 3 and 4. Table 3 depicts transient
expression in 293 cells transfected with a pCMVKm2 vector
carrying the Env cassette of interest. Table 4 depicts
transient expression in cells transfected with a
pCMVKm2 vector carrying the Env cassette of interest.
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Table 4 



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The data showed that the synthetic Env and 
expression cassettes provided a significant increase in
production of their protein products, relative to the
native (HIV-1 SF162 or US4 wild-type) sequences, when
expressed in a variety of cell lines.

C. CHO Cell Line Env Expression Data

Chinese hamster ovary (CHO) cells were transfected
with plasmid DNA encoding the synthetic HIV-1 gp120 or
gp140 proteins (e.g., pESN2dhfr or pCMVIII vector
backbone) using Mirus TransIT-LT1 polyamine transfection
reagent (PanVera) according to the manufacturer's
instructions and incubated for 96 hours. After 96 hours,
the medium was changed to selective medium (F12 special
with 250 μg/ml G418) and cells were split 1:5 and incubated
for an additional 48 hours. The medium was changed every
5-7 days until colonies started forming, at which time the
colonies were picked, plated into 96 well plates and
screened by gp120 Capture ELISA. Positive clones were
expanded in 24 well plates and screened several times for
Env protein production by Capture ELISA, as described
above. After reaching confluency in 24 well plates,
positive clones were expanded to T25 flasks (Corning,
Corning, NY). These were screened several times after
confluency and positive clones were expanded to T75
flasks.

Positive T75 clones were frozen in LN2 and the
highest expressing clones amplified with 0-5
methotrexate (MTX) at several concentrations and plated in
100 mm culture dishes. Plates were screened for colony
formation and all positive clones were again expanded as
described above. Clones were expanded and amplified and
screened at each step by gp120 capture ELISA. Positive
clones were frozen at each methotrexate level. The highest
producing clones were grown in perfusion bioreactors (3L,
100L) for expansion and adaptation to low serum
suspension culture conditions for scale-up to larger
bioreactors.

Tables 5 and 6 show Capture ELISA data from CHO
cells transfected with pCMVIII vector carrying a cassette
encoding synthetic HIV-US4 and SF162 Env polypeptides
(e.g., mutated cleavage sites, modified codon usage
and/or deleted hypervariable regions). Thus, stably
transfected CHO cell lines which express Env polypeptides
(e.g., gp120, gp140-monomeric, and gp140-oligomeric) have
been produced.
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Table 5 



CHO Cell Lines Expression Level of US4 Envelope Constructs

Constructs                  CHO Clone #  MTX Level  Expression Level*
                                                    (ng/ml)
--------------------------  -----------  ---------  -----------------
gp120.modUS4                1            3.2 μM     250-450
                            2            1.6 μM     350-450
                            3            200 nM     230-580***
                            4            200 nM     300-500
gp140.modUS4                1            1 μM       155-300
                            2            1 μM       100-260
                            3            1 μM       200-430
gp140.mut.modUS4            1            1 μM       110-270
                            2            1 μM       100-235
                            3            1 μM       100-220
gp140.modUS4.delV1/V2       1            50 nM      313-587**
                            2            50 nM      237-667**
                            3            50 nM      492-527**
gp140.mut.modUS4.delV1/V2   1            50 nM      46-328**
                            2            50 nM      82-318**
                            3            50 nM      204-385**

*All samples measured at T-75 flask stage unless otherwise
indicated

**at 24 well and 6 well plate stages

***in a three liter bioreactor perfusion culture this
clone yielded approximately 2-5 μg/ml.
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Table 6 



10 



CHO Cell Lines Expression Level of 

C ons t rue t s 

Constructs I CHO Clone # | MTX 

Level 



gpl20 .modSF162 l 



gpl4 0.modSF162 

gpl4 0.mut. 
modSF162 

gpl20.modSF162 
. delV2 

gpl4 0 .modSF162 
.delV2 



2 
3 
1 
2 



2 
3 



15 



gpi40.mut. | i 
modSFl 62 .del V2 | 2 

3 
4 



0 
0 
0 

20 nM 
20 nM 
20 nM 
20 nM 
800nM 
800nM 
800nM 

800nM 
800nM 
800nM 



800nM 
4 00nM 
800nM 
400nM 



SF162 Envelope 

Expression Level" 
(ng/ml) 



755 
928 
538 

180 

164 

188 

233 

528 

487 

589 

300 

200 

200 



2705 
1538 
1609 

350 

451 

487 

804 

1560 
1878 
1212 
600 
400 
500 



* A1 1 sam Ples measured at T-75 flask staqe 
indicated 3 



300-700 
1161 
400-600 
1600-2176 



unless otherwise 
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25 



The results presented above demonstrate the ability
of the constructs of the present invention to provide
expression of Env polypeptides in CHO cells. Production
of polypeptides using CHO cells provides (i) correct
glycosylation patterns and protein conformation (as
determined by binding to a panel of MAbs); (ii) correct
binding to CD4 receptor molecules; (iii) absence of non-
mammalian cell contaminants (e.g., insect viruses and/or
cells); and (iv) ease of purification.

D. Tat Coding Sequences

The HIV-SF162 ("SF162") wild-type Tat (SEQ ID NO:85)
sequences were cloned into expression vectors having the
same features as the vectors into which the synthetic Tat
sequences were cloned (SEQ ID NOs:87, 88 and 89).

Expression efficiencies for various vectors carrying
the SF162 wild-type and synthetic Tat sequences are
evaluated essentially as described above for Gag and Env
using capture ELISAs with the appropriate anti-Tat
antibodies and/or CHO cell assays. Expression of the
polypeptides encoded by the synthetic cassettes is
improved relative to wild type.

Example 3
Western Blot Analysis of Expression
A. Gag and Gag-Protease Coding Sequences

Human 293 cells were transfected as described in
Example 2 with pCMV6a-based vectors containing native or
synthetic Gag expression cassettes. Cells were
cultivated for 60 hours post-transfection. Supernatants
were prepared as described. Cell lysates were prepared
as follows. The cells were washed once with phosphate-
buffered saline, lysed with detergent [1% NP40 (Sigma
Chemical Co., St. Louis, MO) in 0.1 M Tris-HCl, pH 7.5],
and the lysate transferred into fresh tubes. SDS-
polyacrylamide gels (pre-cast 8-16%; Novex, San Diego,
CA) were loaded with 20 μl of supernatant or 12.5 μl of
cell lysate. A protein standard was also loaded (5 μl,
broad size range standard; BioRad Laboratories, Hercules,
CA). Electrophoresis was carried out and the proteins
were transferred using a BioRad Transfer Chamber (BioRad
Laboratories, Hercules, CA) to Immobilon P membranes
(Millipore Corp., Bedford, MA) using the transfer buffer
recommended by the manufacturer (Millipore), where the
transfer was performed at 100 volts for 90 minutes. The
membranes were exposed to HIV-1-positive human patient
serum and immunostained using o-phenylenediamine
dihydrochloride (OPD; Sigma).

The results of the immunoblotting analysis showed
that cells containing the synthetic Gag expression
cassette produced the expected p55 protein at higher per-
cell concentrations than cells containing the native
expression cassette. The Gag p55 protein was seen in
both cell lysates and supernatants. The levels of
production were significantly higher in cell supernatants
for cells transfected with the synthetic Gag expression
cassette of the present invention. Experiments performed
in support of the present invention suggest that cells
containing the synthetic Gag-prot expression cassette
produced the expected Gag-prot protein at comparably
higher per-cell concentrations than cells containing the
native expression cassette.

In addition, supernatants from the transfected 293
cells were fractionated on sucrose gradients. Aliquots of
the supernatant were transferred to Polyclear™ ultra-
centrifuge tubes (Beckman Instruments, Columbia, MD),
under-laid with a solution of 20% (wt/wt) sucrose, and
subjected to 2 hours centrifugation at 28,000 rpm in a
Beckman SW28 rotor. The resulting pellet was suspended
in PBS and layered onto a 20-60% (wt/wt) sucrose gradient
and subjected to 2 hours centrifugation at 40,000 rpm in
a Beckman SW41Ti rotor.
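
Because the centrifugation steps above are specified in rpm for particular rotors, it can be useful to express them as relative centrifugal force when a different rotor is used. The sketch below applies the standard relation RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm). The radius values in the example are assumptions supplied for illustration only and should be replaced with the figure from the rotor documentation; the function name is hypothetical.

```python
def relative_centrifugal_force(rpm, radius_cm):
    """Return RCF (x g) for a given speed and rotor radius in centimetres."""
    return 1.118e-5 * radius_cm * rpm ** 2


if __name__ == "__main__":
    # Radii below are illustrative assumptions, not values from the text.
    for label, rpm, radius_cm in [("pelleting step", 28000, 16.1),
                                  ("gradient step", 40000, 15.3)]:
        rcf = relative_centrifugal_force(rpm, radius_cm)
        print(f"{label}: {rpm} rpm at r = {radius_cm} cm is about {rcf:,.0f} x g")
```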

The gradient was then fractionated into
approximately 10 x 1 ml aliquots (starting at the top,
20%-end, of the gradient). Samples were taken from
fractions 1-9 and were electrophoresed on 8-16% SDS
polyacrylamide gels. Fraction number 4 (the peak
fraction) corresponds to the expected density of Gag
protein VLPs. The supernatants from 293/synthetic Gag
cells gave much stronger p55 bands than supernatants from
293/native Gag cells, and, as expected, the highest
concentration of p55 in either supernatant was found in
fraction 4.

These results demonstrate that the synthetic Gag 
expression cassette provides superior production of both 
p55 protein and VLPs, relative to the native Gag coding 
sequences . 

B. Env Coding Sequences

Human 293 cells were transfected as described in
Example 2 with pCMVKm2-based, pCMVlink-based, pCMVIII-
based or pESN2-based vectors containing native or
synthetic Env expression cassettes. Cells were
cultivated for 48 or 60 hours post-transfection. Cell
lysates and supernatants were prepared as described
(Example 2). Briefly, the cells were washed once with
phosphate-buffered saline, lysed with detergent [1% NP40
(Sigma Chemical Co., St. Louis, MO) in 0.1 M Tris-HCl,
pH 7.5], and the lysate transferred into fresh tubes.
SDS-polyacrylamide gels (pre-cast 8-16%; Novex, San
Diego, CA) were loaded with 20 μl of supernatant or 12.5
μl of cell lysate. A protein molecular weight standard
(5 μl, broad size range standard; BioRad Laboratories,
Hercules, CA) and an HIV SF2 gp120 positive control
protein were also loaded. Electrophoresis was carried out
and the proteins were transferred using a BioRad Transfer
Chamber (BioRad Laboratories, Hercules, CA) to Immobilon
P membranes (Millipore Corp., Bedford, MA) using the
transfer buffer recommended by the manufacturer
(Millipore), where the transfer was performed at 100
volts for 90 minutes. The membranes were then reacted
against polyclonal goat anti-gp120SF2/env2-3 anti-sera,
followed by incubation with swine anti-goat IgG-
peroxidase (POD) (Sigma, St. Louis, MO). Bands
indicative of binding were visualized by adding DAB with
hydrogen peroxide, which deposits a brown precipitate on
the membranes.

The results of the immunoblotting analysis showed
that cells containing the synthetic Env expression
cassette produced the expected Env gp proteins of the
predicted molecular weights, as determined by mobilities
in SDS-polyacrylamide gels, at higher per-cell
concentrations than cells containing the native
expression cassette. The Env proteins were seen in both
cell lysates and supernatants. The levels of production
were significantly higher in cell supernatants for cells
transfected with the synthetic Env expression cassette of
the present invention.

C. Tat Coding Sequences

Human 293 cells are transfected as described in 
Example 2 with various vectors containing native or 
synthetic Tat expression cassettes. Cells are cultivated 
and isolated proteins analyzed as described above. 
Immunoblotting analysis shows that cells containing the 
synthetic Tat expression cassette produced the expected 
Tat proteins of the predicted molecular weights as 
determined by mobilities in SDS-polyacrylamide gels at 
higher per-cell concentrations than cells containing the 
native expression cassette. 
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Example 4 
Purification of Env polypeptides 
A. Purification of Oligomeric gp140

Purification of oligomeric gp140 (o-gp140 US4) was
conducted essentially as shown in Figure 60. For the
experiments described herein, o-gp140 refers to
oligomeric gp140 in either native or modified (e.g.,
optimized expression sequences, deleted, mutated,
truncated, etc.) form. Briefly, concentrated (30-50X)
supernatants obtained from CHO cell cultures were loaded
onto an anion exchange (DEAE) column which removed DNA
and other serum proteins. The eluted material was loaded
onto a ceramic hydroxyapatite column (CHAP) which bound
serum proteins but not HIV Env proteins. The flow-
through from the DEAE and CHAP columns was loaded onto a
Protein A column as a precautionary step to remove any
remaining serum immunoglobulins. The Env proteins in the
flow-through were then captured using the lectin
Galanthus nivalis agglutinin (GNA, Vector Labs,
Burlingame, CA). GNA has high affinity for mannose-rich
carbohydrates such as those of Env. The Env proteins were
then eluted with GNA substrate. To remove other highly
glycosylated proteins, a cation exchange column (SP) was
used to purify gp140/gp120. In a final step, which
separates gp120 from o-gp140, a gel filtration column was
used to separate oligomers from monomers. Sizing and
chromatography analysis of the final product revealed
that this strategy led to the successful isolation of
oligomeric gp140.

B. Purification of gp120

Purification of gp120 was conducted essentially as
previously described for other Env proteins. Briefly,
concentrated supernatants obtained from CHO cell cultures
were loaded onto an anion exchange (DEAE) column which
removed DNA and other serum proteins. The eluted
material was loaded onto a ceramic hydroxyapatite column
(CHAP) which bound serum proteins but not HIV Env
proteins. The flow-through from the CHAP column was
loaded onto a cation exchange column (SP), where the flow-
through was discarded and the bound fraction eluted with
salt. The eluted fraction(s) were loaded onto a Superose
12/Superdex 200 Tandem column (Pharmacia-Upjohn, Uppsala,
Sweden) from which purified gp120 was obtained. Sizing
and chromatography analysis of the final product revealed
that this strategy successfully purified gp120 proteins.

Example 5
Analysis of Purified Env Polypeptides
A. Analysis of o-gp140

It is well documented that HIV Env protein binds to
CD4 only in its correct conformation. Accordingly, the
ability of o-gp140 US4 polypeptides, produced and
purified as described above, to bind CD4 cells was
tested. O-gp140 US4 was incubated for 15 minutes with
FITC-labeled CD4 at room temperature and loaded onto a
Biosil 250 (BioRad) size exclusion column using Waters
HPLC. CD4-FITC has the longest retention time (2.67
minutes), followed by CD4-FITC-gp120 (2.167 min). The
shortest retention time (1.9 min) was observed for CD4-
FITC-o-gp140 US4, indicating that, as expected, o-gp140
US4 binds to CD4, forming a large complex which reduces
retention time on the column. Thus, the o-gp140 US4
produced and purified as described above is of the
correct size and conformation.

In addition, the US4 o-gp140, purified as described
above, was also tested for its ability to bind to a
variety of monoclonal antibodies with known epitope
specificities for the CD4 binding site, the CD4 inducible
site, the V3 loop and an oligomer-specific gp41 epitope.
O-gp140 bound strongly to these antibodies, indicating that
the purified protein retains its structural integrity.

B. Analysis of gp120

As described above, CD4-FITC binds gp120, as
demonstrated by the decreased retention time on the HPLC
column. Thus, US4 gp120 purified by the above method
retains its conformational integrity. In addition, the
properties of purified gp120 can be tested by examining
its integrity and identity on western blots, as well as
by examining protein concentration, pH, conductivity,
endotoxin levels, bioburden and the like. US4 gp120,
purified as described above, was also tested for its
ability to bind to a variety of monoclonal antibodies
with known epitope specificities for the CD4 binding
site, the CD4 inducible site, the V3 loop and an oligomer-
specific gp41 epitope. The pattern of mAb binding to
gp120 indicated that the purified protein retained its
structural integrity; for example, the purified gp120 did
not bind the mAb having the oligomer-specific gp41
epitope (as expected).

Example 6 

Electron Microscopic Evaluation of VLP Production 
The cells for electron microscopy were plated at a 
density of 50-70% confluence, one day before 
transfection. The cells were transfected with 10 µg of 
DNA using transfection reagent LT1 (Panvera) and 
incubated for 5 hours in serum-reduced medium (see 
Example 2). The medium was then replaced with normal 
medium (see Example 2) and the cells were incubated for 
14 hours (COS-7) or 40 hours (CHO). After incubation the 
cells were washed twice with PBS and fixed with 2% 
glutaraldehyde. Electron microscopy was performed by 
Prof. T.S. Benedict Yen (Veterans Affairs Medical 
Center, San Francisco, CA). 

Electron microscopy was carried out using a 
transmission electron microscope (Zeiss 10c). The cells 
were pre-stained with osmium and stained with uranium 
acetate and lead citrate. The magnification was 
100,000X. 

Figures 3A and 3B show micrographs of CHO cells 
transfected with pCMVKM2 carrying the synthetic Gag 
expression cassette (SEQ ID NO: 5) or carrying the 
Gag-prot expression cassette (SEQ ID NO: 79). In the figure, 
free and budding immature virus-like particles (VLP) of 
the expected size (100 nm) are seen for the Gag 
expression cassette (Figure 3A) and both immature and 
mature VLPs are seen for the Gag-prot expression cassette 
(Figure 3B). COS-7 cells transfected with the same 
vector have the same expression pattern. VLP can also be 
found intracellularly in CHO and COS-7 cells. 

Native and synthetic Gag expression cassettes were 
compared for their associated levels of VLP production 
when used to transfect human 293 cells. The comparison 
was performed by density gradient ultracentrifugation of 
cell supernatants and Western-blot analysis of the 
gradient fractions. There was a clear improvement in 
production of VLPs when using the synthetic Gag 
construct. 

Example 7 

Expression of Virus-like Particles in the Baculovirus System 

A. Expression of Native HIV p55 Gag 
To construct the native HIV p55 Gag baculovirus 
shuttle vector, the prototype SF2 HIV p55 plasmid, 
pTM1-Gag (Selby M.J., et al., J Virol. 71(10):7827-7831, 
1997), was digested with restriction endonucleases NcoI 
and BamHI to extract a 1.5 kb fragment that was 
subsequently subcloned into pAcC4 (Bio/Technology 
6:47-55, 1988), a derivative of pAc436. Generation of the 
recombinant baculovirus was achieved by co-transfecting 2 
µg of the HIV p55 Gag pAcC4 shuttle vector with 0.5 µg of 
linearized Autographa californica baculovirus (AcNPV) 
wild-type viral DNA into Spodoptera frugiperda (Sf9) 
cells (Kitts, P.A., Ayres M.D., and Possee R.D., Nucleic 
Acids Res. 18:5667-5672, 1990). The isolation of 
recombinant virus expressing HIV p55 Gag was performed 
according to standard techniques (O'Reilly, D.R., L.K. 
Miller, and V.A. Luckow, Baculovirus Expression Vector: 
A Laboratory Manual, W.H. Freeman and Company, New York, 
1992). 

Expression of the HIV p55 Gag was achieved using a 
500 ml suspension culture of Sf9 cells grown in 
serum-free medium (Miaorella, B., D. Inlow, A. Shauger, and D. 
Harano, Bio/Technology 6:1506-1510, 1988) that had been 
infected with the HIV p55 Gag recombinant baculovirus at 
a multiplicity of infection (MOI) of 10. Forty-eight 
hours post-infection, the supernatant was separated by 
centrifugation and filtered through a 0.2 µm filter. 
Aliquots of the supernatant were then transferred to 
Polyclear™ (Beckman Instruments, Palo Alto, CA) 
ultracentrifuge tubes, underlaid with 20% (wt/wt) 
sucrose, and subjected to 2 hours centrifugation at 24,000 
rpm using a Beckman SW28 rotor. 

The resulting pellet was suspended in Tris buffer 
(20 mM Tris-HCl, pH 7.5, 250 mM NaCl, and 2.5 mM 
ethylenediaminetetraacetic acid [EDTA]), layered onto a 
20-60% (wt/wt) sucrose gradient, and subjected to 2 hours 
centrifugation at 40,000 rpm using a Beckman SW41Ti 
rotor. The gradient was then fractionated starting at 
the top (20% sucrose) of the gradient into approximately 
twelve 0.75 ml aliquots. A sample of each fraction was 
electrophoresed on 8-16% SDS polyacrylamide gels and the 
resulting bands were visualized after Coomassie staining 
(Figure 4). Additional aliquots were subjected to 
refractive index analysis. 

The results shown in Figure 4 indicated that the p55 
Gag virus-like particles banded at a sucrose density 
range of 1.15-1.19 g/ml with the peak at approximately 
1.17 g/ml. The peak fractions were pooled and 
concentrated by a second 20% sucrose pelleting. The 
resulting pellet was suspended in 1 ml of Tris buffer 
(described above). The total protein yield as estimated 
by Bicinchoninic Acid (BCA) assay (Pierce Chemical, Rockford, 
IL) was 1.6 mg. 

B. Expression of Synthetic HIV p55 Gag 

A baculovirus shuttle vector containing the 
synthetic p55 Gag sequence was constructed as follows. 
The synthetic HIV p55 expression cassette (Example 1) was 
digested with restriction enzyme SalI followed by 
incubation with T4-DNA polymerase. The resulting 
fragment was isolated (PCR Clean-Up™, Promega, Madison, 
WI) and then digested with BamHI endonuclease. The 
shuttle vector pAcC13 (Munemitsu S., et al., Mol Cell 
Biol. 10(11):5977-5982, 1990) was linearized by digestion 
with Serf, followed by incubation with T4-DNA polymerase, 
and then isolated (PCR Clean-Up™). The linearized vector 
was digested with BamHI, treated with alkaline 
phosphatase, and isolated by size fractionation in an 
agarose gel. The isolated 1.5 kb fragment was ligated 
with the prepared pAcC13 vector. The resulting clone was 
designated pAcC13-Modif.p55Gag. 

The expression conditions for the synthetic HIV p55 
VLPs differed from those of the native p55 Gag as 
follows: a culture volume of 1 liter was used instead of 500 
ml; Trichoplusia ni (Tn5) (Wickham, T.J., and Nemerow, 
G.R., BioTechnology Progress, 9:25-30, 1993) insect cells 
were used instead of Sf9 insect cells; and an MOI of 3 
was used instead of an MOI of 10. Experiments performed in 
support of the present invention showed that there was no 
appreciable difference in expression level between the 
Sf9 and Tn5 insect cells with the native p55 clone. In 
terms of MOI, experience with the native p55 clone 
suggested that an MOI of 10 resulted in higher expression 
(approximately 2-fold) of VLPs than a lower MOI. 

The sucrose pelleting and banding methods used for 
the synthetic p55 VLPs were similar to those employed for 
the native p55 VLPs (described above), with the following 
exceptions: pelleted VLPs were suspended in 4 ml of 
phosphate buffered saline (PBS) instead of 1.0 ml of the 
Tris buffer; and four 20-60% sucrose gradients were used 
instead of a single gradient. Also, due to the high 
concentration of banded VLPs, further concentration by 
pelleting was not required. The peak fractions from all 
4 gradients were simply dialyzed against PBS. The 
approximate density of the banded VLPs ranged from 
1.23-1.28 g/ml. The total protein yield as estimated by BCA was 
46 mg. Results from the sucrose gradient banding of the 
synthetic p55 are shown in Figure 5. 

A comparison of the total amount of purified HIV p55 
Gag from several preparations obtained from the two 
baculovirus expression cassettes has been summarized in 
Figure 6. The average yield from the native p55 was 3.16 
mg/liter of culture (n=5, standard deviation (sd) ±1.07, 
range = 1.8-4.8 mg/L) whereas the average yield from the 
synthetic p55 was more than ten-fold higher at 44.5 
mg/liter of culture (n=2, sd=±6.4). 
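The "more than ten-fold" comparison can be checked directly from the two mean yields quoted above. The following is a minimal sketch; the function name `fold_change` and the variable names are illustrative only and are not part of the described methods.

```python
# Mean yields quoted in the text (mg of purified p55 Gag per liter of culture).
native_mean_mg_per_l = 3.16      # native p55 cassette, n=5
synthetic_mean_mg_per_l = 44.5   # synthetic p55 cassette, n=2


def fold_change(new: float, reference: float) -> float:
    """Return how many times larger `new` is than `reference`."""
    if reference <= 0:
        raise ValueError("reference yield must be positive")
    return new / reference


ratio = fold_change(synthetic_mean_mg_per_l, native_mean_mg_per_l)
print(f"synthetic / native yield = {ratio:.1f}-fold")
```

With only two synthetic preparations (n=2), the ratio is best read as an approximate figure rather than a precise estimate.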

In addition to a higher total protein yield, the 
final product from the synthetic p55-expressed Gag 
consistently contained lower amounts of contaminating 
baculovirus proteins than the final product from the 
native p55-expressed Gag. This difference can be seen in 
the two Coomassie-stained gels (Figures 4 and 5). 

C. Expression of Native and Synthetic Gag-Core 

Expression of the HIV p55 Gag/HCV Core 173 (SEQ ID 
NO: 8) was achieved using a 2.5 liter suspension culture 
of Sf9 cells grown in serum-free medium (Miaorella, B., 
D. Inlow, A. Shauger, and D. Harano. 1988 Bio/Technology 
6:1506-1510). The cells were infected with an HIV p55 
Gag/HCV Core 173 recombinant baculovirus. Forty-eight 
hours post-infection, the supernatant was separated from 
the cells by centrifugation and filtered through a 0.2 µm 
filter. Aliquots of the supernatant were then 
transferred to Polyclear™ (Beckman Instruments, Palo 
Alto, CA) ultracentrifuge tubes containing 30% (wt/wt) 
sucrose, and subjected to 2 hours of centrifugation at 
24,000 rpm in a Beckman SW28 rotor and ultracentrifuge. 

The resulting pellet was suspended in Tris buffer 
(50 mM Tris-HCl, pH 7.5, 500 mM NaCl) and layered onto a 
30-60% (wt/wt) sucrose gradient and subjected to 2 hours 
centrifugation at 40,000 rpm in a Beckman SW41Ti rotor 
and ultracentrifuge. The gradient was then fractionated 
starting at the top (30%) of the gradient into 
approximately 11 x 1.0 ml aliquots. A sample of each 
fraction was electrophoresed on 8-16% SDS polyacrylamide 
gels and the resulting bands were visualized after 
Coomassie staining. 

A subset of aliquots was also subjected to Western 
blot analysis using monoclonal antibody 76C.5EG (Steimer, 
K.S., et al., Virology 150:283-290, 1986), which is 
specific for HIV p24 (a subunit of HIV p55). The peak 
fractions from the sucrose gradient were pooled and 
concentrated by a second 20% sucrose pelleting. The 
resulting pellet was suspended in 1 ml of Tris 
buffer and the total protein yield as estimated by BCA 
(Pierce Chemical, Rockford, IL) was ~1.0 mg. 
The results from the SDS PAGE are shown in Figure 8 
and the anti-p24 Western blot results are shown in 
Figure 9. Taken together, these results indicate that 
the HIV p55 Gag/HCV Core 173 chimeric VLPs banded at a 
sucrose density similar to that of the HIV p55 Gag VLPs 
and that the visible protein band that migrated at a molecular 
weight of ~72 kd was reactive with the HIV p24-specific 
monoclonal antibody. An additional 
band at approximately 55 kd also 
appeared to be reactive with the anti-p24 antibody and 
may be a degradation product. 

Although aliquots from the above preparation were 
not tested for reactivity with an HCV Core-specific 
antibody (an anti-CD22 rabbit serum), results from a 
similar preparation are shown in Figure 10 and indicate 
that the main HCV Core-specific reactivity migrates at an 
approximate molecular weight of 72 kd, which is in 
accordance with the predicted molecular weight of the 
chimeric protein. 

The expression conditions for the synthetic HIV p55 
Gag/HCV Core 173 (SEQ ID NO: 8) VLPs differed from those 
of the native p55 Gag as follows: a culture 
volume of 1 liter was used instead of 2.5 liters; 
Trichoplusia ni (Tn5) (Wickham, T.J., and Nemerow, G.R. 
1993 BioTechnology Progress, 9:25-30) insect cells were 
used instead of Sf9 insect cells; and an MOI of 3 was 
used instead of an MOI of 10. The sucrose pelleting and 
banding methods used for the synthetic HIV p55 Gag/HCV 
Core 173 VLPs were similar to those employed for the 
native HIV p55 Gag/HCV Core 173 VLPs. However, 
differences included: pelleted VLPs were suspended in 1 
ml of phosphate buffered saline (PBS) instead of 1.0 ml 
of the Tris buffer, and a single 20-60% sucrose gradient 
was used. A comparison of the total amount of purified 
HIV p55 Gag/HCV Core 173 from multiple preparations 
obtained from the two baculovirus expression cassettes 
showed that there was an increase in expression using the 
synthetic HIV p55 Gag/HCV Core 173 cassette. 

D. Alternative Method for the Enrichment of HIV p55 
Gag VLPs 

In addition to purification from the media, p55 (Gag 
protein) expressed in baculovirus (e.g., using a 
synthetic expression cassette of the present invention) 
can also be purified as virus-like particles from the 
infected insect cells. For example, forty-eight hours 
post infection, the media and cell pellet are separated 
by centrifugation and the cell pellet is stored at -70°C 
until future use. At the time of processing, the cell 
pellet is suspended in 5 volumes of hypotonic lysis 
buffer (20 mM Tris-HCl, pH 8.2, 1 mM EGTA, 1 mM MgCl2, 
and Complete Protease Inhibitor® [Boehringer Mannheim 
Corp., Indianapolis, IN]). If needed, the cells are then 
dounced 8-10 times to complete cell lysis. 

The lysate is then centrifuged at approximately 
1000-1500 x g for 20 minutes. The supernatant is 
decanted into UltraClear™ tubes, underlaid with 20% 
sucrose (w/w) and centrifuged at 24,000 rpm in SW28 
buckets for 2 hours. The resulting pellet is suspended 
in Tris buffer (20 mM Tris-HCl, pH 7.5, 250 mM NaCl, and 
2.5 mM ethylenediaminetetraacetic acid (EDTA)) with 0.1% 
IGEPAL detergent (Sigma Chemical, St. Louis, MO) and 250 
units/ml of benzonase (American International Chemical, 
Inc., Natick, MA) and incubated at 4°C for at least 30 
minutes. The suspension is subsequently layered onto a 
20-60% sucrose gradient and spun at 40,000 rpm using an 
SW41Ti rotor for 20-24 hours. 

After ultracentrifugation, the sucrose gradient is 
fractionated and aliquots are run on SDS PAGE to identify 
peak fractions. The peak fractions are dialyzed against 
PBS and measured for protein content. Negatively stained 
electron micrographs typically show non-enveloped VLPs 
somewhat smaller in diameter (80-120 nm) than the budded 
VLPs. HIV Gag VLPs prepared in this manner are also 
capable of generating Gag-specific CTL responses in mice. 

Example 8 

In vivo Immunogenicity of Synthetic Gag Expression Cassettes 

A. Immunization 

To evaluate the possibly improved immunogenicity of 
the synthetic Gag expression cassettes, a mouse study was 
performed. The plasmid DNA, pCMVKM2 carrying the 
synthetic Gag expression cassette, was diluted to the 
following final amounts in a total injection 
volume of 100 µl: 20 µg, 2 µg, 0.2 µg, and 0.02 µg. To 
overcome possible negative dilution effects of the 
diluted DNA, the total DNA in each sample 
was brought up to 20 µg using the vector (pCMVKM2) alone. 
As a control, plasmid DNA of the native Gag expression 
cassette was handled in the same manner. Twelve groups 
of four Balb/c mice (Charles River, Boston, MA) were 
intramuscularly immunized (50 µl per leg, intramuscular 
injection into the tibialis anterior) according to the 
schedule in Table 7. 
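The dose preparation described above (each Gag plasmid dose topped up with empty vector to 20 µg of total DNA) amounts to a simple subtraction per dose. A minimal sketch follows; the names `TOTAL_DNA_UG` and `carrier_amount` are illustrative assumptions, and `Decimal` is used only to avoid floating-point rounding in the small doses.

```python
from decimal import Decimal

# Total DNA per 100 µl injection, as stated in the text (µg).
TOTAL_DNA_UG = Decimal("20")

# Gag expression plasmid doses listed in the text (µg).
doses_ug = [Decimal("20"), Decimal("2"), Decimal("0.2"), Decimal("0.02")]


def carrier_amount(dose_ug: Decimal, total_ug: Decimal = TOTAL_DNA_UG) -> Decimal:
    """Amount of empty vector (µg) needed to bring a dose up to the total."""
    if dose_ug < 0 or dose_ug > total_ug:
        raise ValueError("dose must lie between 0 and the total DNA amount")
    return total_ug - dose_ug


carrier = {dose: carrier_amount(dose) for dose in doses_ug}
for dose, vector in carrier.items():
    print(f"{dose} µg Gag plasmid + {vector} µg vector alone = {TOTAL_DNA_UG} µg")
```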

Table 7 




1 = initial immunization at "week 0" 

Groups 1-4 were bled at week 0 (before 
immunization), week 4, week 6, week 8, and week 12. 
Groups 5-12 were bled at week 0 (before immunization) and 
at week 4. 

B. Humoral Immune Response 

The humoral immune response was checked with 
anti-HIV Gag antibody ELISAs (enzyme-linked immunosorbent 
assays) of the mice sera 0 and 4 weeks post immunization 
(groups 5-12) and, in addition, 6 and 8 weeks post 
immunization, that is, 2 and 4 weeks post second 
immunization (groups 1-4). 

The antibody titers of the sera were determined by 
anti-Gag antibody ELISA. Briefly, sera from immunized 
mice were screened for antibodies directed against the 
HIV p55 Gag protein. ELISA microtiter plates were coated 
with 0.2 µg of HIV-1 SF2 p24-Gag protein per well overnight 
and washed four times; subsequently, blocking was done 
with PBS-0.2% Tween (Sigma) for 2 hours. After removal 
of the blocking solution, 100 µl of diluted mouse serum 
was added. Sera were tested at 1/25 dilutions and by 
serial 3-fold dilutions thereafter. Microtiter plates 
were washed four times and incubated with a secondary, 
peroxidase-coupled anti-mouse IgG antibody (Pierce, 
Rockford, IL). ELISA plates were washed and 100 µl of 
3,3',5,5'-tetramethylbenzidine (TMB; Pierce) was added 
per well. The optical density of each well was measured 
after 15 minutes. The titers reported are the reciprocal 
of the dilution of serum that gave a half-maximum optical 
density (O.D.). The ELISA results are presented in Table 
8. 
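The titer definition above (reciprocal of the serum dilution giving a half-maximum O.D., with sera tested from 1/25 in serial 3-fold steps) can be illustrated with a short calculation. This is a sketch only: the text does not state how the half-maximum point was located between tested dilutions, so the log-linear interpolation used here is an assumption, and the O.D. readings are invented example values, not data from the study.

```python
import math


def reciprocal_dilutions(start=25, factor=3, steps=8):
    """Reciprocal dilutions for a series starting at 1/start in fixed-fold steps."""
    return [start * factor ** i for i in range(steps)]


def half_max_titer(dilutions, ods):
    """Reciprocal dilution at which O.D. falls to half of the maximum O.D.

    Interpolates linearly in log(dilution) between the two tested dilutions
    that bracket the half-maximum value (an assumed scheme). Returns None if
    the half-maximum is never crossed within the tested range.
    """
    if len(dilutions) != len(ods) or len(ods) < 2:
        raise ValueError("need matching lists with at least two points")
    half = max(ods) / 2.0
    for i in range(len(ods) - 1):
        od_a, od_b = ods[i], ods[i + 1]
        if od_a >= half >= od_b and od_a != od_b:
            frac = (od_a - half) / (od_a - od_b)
            log_titer = math.log(dilutions[i]) + frac * (
                math.log(dilutions[i + 1]) - math.log(dilutions[i])
            )
            return math.exp(log_titer)
    return None


dilutions = reciprocal_dilutions()          # 25, 75, 225, 675, ...
example_ods = [2.0, 1.9, 1.5, 0.8, 0.4, 0.1, 0.05, 0.05]  # hypothetical readings
titer = half_max_titer(dilutions, example_ods)
print(f"half-maximum titer is approximately {titer:.0f}")
```

In this hypothetical series the half-maximum O.D. (1.0) falls between the 1/225 and 1/675 dilutions, so the reported titer lies between 225 and 675.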







1 = synthetic gag expression cassette (SEQ ID NO: 4) 
2 = native gag expression cassette (SEQ ID NO: 1) 
3 = geometric mean antibody titer 
4 = not applicable 

The results of the mouse immunizations with 
plasmid-DNAs show that the synthetic expression cassettes provide 
a clear improvement of immunogenicity relative to the 
native expression cassettes. Also, the second boost 
immunization induced a secondary immune response after 
two weeks (groups 1-3). 



C. Cellular Immune Response 

The frequency of specific cytotoxic T-lymphocytes 
(CTL) was evaluated by a standard chromium release assay 
of peptide pulsed Balb/c mouse CD4 cells. Gag expressing 
vaccinia virus infected CD-8 cells were used as a 
positive control (vvGag). Briefly, spleen cells 
(Effector cells, E) were obtained from the BALB/c mice 
immunized as described above (Table 8), cultured, 
restimulated, and assayed for CTL activity against Gag 
peptide-pulsed target cells as described (Doe, B., and 
Walker, C.M., AIDS 10(7):793-794, 1996). The HIV-1 SF2 Gag 
peptide used was p7g (SEQ ID NO: 10). Cytotoxic activity was 
measured in a standard 51Cr release assay. Target (T) 
cells were cultured with effector (E) cells at various 
E:T ratios for 4 hours and the average cpm from duplicate 
wells was used to calculate percent specific 51Cr release. 
The results are presented in Table 9. 
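The text does not spell out how percent specific 51Cr release was computed from the averaged duplicate wells. A commonly used form of the calculation is sketched below as an assumption: release in the test wells is corrected for spontaneous release (targets alone) and expressed relative to maximum release (fully lysed targets). The function names and the cpm values are hypothetical.

```python
def mean(values):
    """Arithmetic mean of replicate wells."""
    return sum(values) / len(values)


def percent_specific_release(experimental_cpm, spontaneous_cpm, maximum_cpm):
    """Percent specific 51Cr release from replicate cpm readings.

    Assumed formula: 100 * (experimental - spontaneous) / (maximum - spontaneous),
    using the mean of the replicate wells for each condition.
    """
    experimental = mean(experimental_cpm)
    spontaneous = mean(spontaneous_cpm)
    maximum = mean(maximum_cpm)
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)


# Hypothetical duplicate wells at a single E:T ratio.
lysis = percent_specific_release(
    experimental_cpm=[1200, 1300],
    spontaneous_cpm=[500, 500],
    maximum_cpm=[2000, 2000],
)
print(f"percent specific lysis = {lysis:.0f}%")
```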

Cytotoxic T-cell (CTL) activity was measured in 
splenocytes recovered from the mice immunized with HIV 
Gag DNA (compare Effector column, Table 9, to 
immunization schedule, Table 8). Effector cells from the 
Gag DNA-immunized animals exhibited specific lysis of Gag 
p7g peptide-pulsed SV-BALB (MHC matched) target cells, 
indicative of a CTL response. Target cells that were 
peptide-pulsed and derived from an MHC-unmatched mouse 
strain (MC57) were not lysed (Table 9; MC/p7g). 

Table 9 

Cytotoxic T-lymphocyte (CTL) responses (a) in mice immunized with HIV-1 Gag DNA 

[Table 9. Rows: effector cells from mice immunized with gagmod or native gag plasmid DNA at each DNA dose (20 µg, 2 µg, 0.2 µg, ...), each assayed at E:T ratios of 100:1, 30:1 and 10:1. Columns: percent specific lysis of target cells (SVBALB, none; SVBALB, p7g; RMA, p7g).] 

a [...] animals per DNA dose; 
positive CTL responses are indicated by boxed data 

The results of the CTL assays show increased potency 
of synthetic Gag expression cassettes for induction of 
cytotoxic T-lymphocyte (CTL) responses by DNA immunization. 

Example 9 

In vivo Immunization with Env Polypeptides 
A. Immunogenicity Study of US4 o-gp140 in RAS-3c Adjuvant 
System 

Studies have been conducted using rabbits immunized 
with US4 o-gp140 purified as described above. Studies are 
also underway in animals to determine immunogenicity of US4 
gp120, SF162 o-gp140 and SF162 gp120. 

Two rabbits (#1 and #2) were immunized intramuscularly 
at 0, 4, 12 and 24 weeks with 50 µg of US4 o-gp140 in the 
Ribi™ adjuvant system (RAS-3c) (Ribi Immunochem, Hamilton, 
MT) containing 2% Squalene, 0.2% Tween 80, and one or more 
bacterial cell wall components from the group consisting of 
monophosphoryl lipid A (MPL, Ribi Immunochem, Hamilton, MT). 
In each experiment described herein, o-gp140 can be native, 
mutated and/or modified. Antibody responses directed 
against the US4 o-gp140 protein were measured by ELISA. 
Results are shown in Table 10. 

Table 10 

Rabbit/sample                      Approximate o-gp140 ELISA titer 
pre-immunization                   0 
#1: post1 (0 week immuniz.)        400 
#1: post2 (4 week immuniz.)        15,000 
#1: post3 (12 week immuniz.)       50,000 
#1: post4 (24 week immuniz.)       100,000 
#2: post1 (0 week immuniz.)        600 
#2: post2 (4 week immuniz.)        12,000 
#2: post3 (12 week immuniz.)       25,000 
#2: post4 (24 week immuniz.)       55,000 

The avidities of antibodies directed against the US4 
o-gp140 protein were measured in a similar ELISA format 
employing successive washes with increasing concentrations 
of ammonium thiocyanate. Results are shown in Table 11. 

Table 11 

Time of sample                     Approx. antibody avidity 
                                   (NH4SCN conc. in M) 
pre-immunization                   0.02 
post1 (0 week immuniz.)            1.8 
post2 (4 week immuniz.)            3.5 
post3 (12 week immuniz.)           5.5 
post4 (24 week immuniz.)           5.1 

These results show that US4 o-gp140 is highly 
immunogenic and able to induce substantial antibody 
responses after only one or two immunizations. 

B. Immunogenicity of US4 o-gp140 in MF59-based Adjuvants 

Groups of 4 rabbits were immunized intramuscularly at 
0, 4, 12 and 24 weeks with various doses of US4 o-gp140 
protein in three different MF59-based adjuvants (MF59 is 
described in International Publication No. WO 90/14837 and 
typically contains 5% Squalene, 0.5% Tween 80, and 0.5% 
Span 85). Antibody titers were measured after the third 
immunization by ELISA using SF2 gp120 to coat the plates. 
QHC is a Quil-based adjuvant (Iscotek, Uppsala, Sweden). 
Results are shown in Table 12. 





Table 12 

Antigen dose (µg)    Adjuvant         Anti-gp120 SF2 Ab GMT* 
12.5                 MF59             7231 
25                   MF59             8896 
50                   MF59             12822 
12.5                 MF59/MPL         24146 
25                   MF59/MPL         27199 
50                   MF59/MPL         23059 
50                   MF59/MPL/QHC     31759 



Thus, adjuvanted o-gp140 generated antigen-specific 
antibodies. Further, the antibodies were shown to 
increase in avidity over time. 

C. Neutralizing Antibodies 

Neutralizing antibodies post-third immunization were 
measured against HIV-1 SF2 in a T-cell line adapted virus 
(TCLA) assay and against PBMC-grown HIV-1 variants SF2, 
SF162 and 119 using the CCR5+ CEMx174 LTR-GFP reporter cell 
line, 5.25 (provided by N. Landau, Salk Institute, San 
Diego, CA) as target cells. Results are shown in Table 13. 

Table 13 

Neutralizing antibody responses in rabbits immunized 
with o-gp140.modUS4 protein 

[Table 13. Rows: individual animals 217, 218, 792-795 and 804-811, grouped by immunogen (50 µg each): o-gp140/Ras-3c; and, in Experiment 2, o-gp140/MF59, o-gp140/MF59 + MPL, and o-gp140/MF59 + MPL + QHC. Columns: neutralizing antibody responses against TCLA and PBMC-grown virus, including SF162 PBMC.] 

* [...] neutralizing antibody titers (50% inhibition) 
**Not Determined 

The above studies in rabbits indicate that the US4 
o-gp140 protein is highly immunogenic. When administered with 
adjuvant, this protein was able to induce substantial 
antibody responses after only one or two immunizations. 
Moreover, the adjuvanted o-gp140 protein was able to 
generate antigen-specific antibodies which increased in 
avidity after successive immunizations, and substantial 
neutralizing activity against T-cell line adapted HIV-1. 
Neutralizing activity was also observed against PBMC-grown 
primary HIV strains, including the difficult to neutralize 
CCR5 co-receptor (R5)-utilizing isolates, SF162 and 119. 

Example 10 

In Vivo Immunogenicity of Synthetic Env Expression Cassettes 

A. General Immunization Methods 

To evaluate the immunogenicity of the synthetic Env 
expression cassettes, studies using guinea pigs, rabbits, 
mice, rhesus macaques and baboons were performed. The 
studies were structured as follows: DNA immunization alone 
(single or multiple); DNA immunization followed by protein 
immunization (boost); DNA immunization followed by Sindbis 
particle immunization; immunization by Sindbis particles 
alone. 

B. Humoral Immune Response 

The humoral immune response was checked in serum 
specimens from immunized animals with anti-HIV Env 
antibody ELISAs (enzyme-linked immunosorbent assays) at 
various times post-immunization. The antibody titers of 
the sera were determined by anti-Env antibody ELISA as 
described above. Briefly, sera from immunized animals were 
screened for antibodies directed against the HIV gp120 or 
gp140 Env protein. Wells of ELISA microtiter plates were 
coated overnight with the selected Env protein and washed four 
times; subsequently, blocking was done with PBS-0.2% Tween 
(Sigma) for 2 hours. After removal of the blocking 
solution, 100 µl of diluted mouse serum was added. Sera 
were tested at 1/25 dilutions and by serial 3-fold 
dilutions thereafter. Microtiter plates were washed four 
times and incubated with a secondary, peroxidase-coupled 
anti-mouse IgG antibody (Pierce, Rockford, IL). ELISA 
plates were washed and 100 µl of 3,3',5,5'-tetramethylbenzidine 
(TMB; Pierce) was added per well. The optical 
density of each well was measured after 15 minutes. Titers 
are typically reported as the reciprocal of the dilution of 
serum that gave a half-maximum optical density (O.D.). 



Example 11 

DNA-immunization of Baboons Using Synthetic Gag 
Expression Cassettes 

A. Baboons 

Four baboons were immunized 3 times (weeks 0, 4 and 8) 
bilaterally, intramuscularly into the quadriceps, using 1 mg 
pCMVKM2.GagMod.SF2 plasmid-DNA (Example 1). The animals 
were bled two weeks after each immunization and a p24 
antibody ELISA was performed with isolated plasma. The 
ELISA was performed essentially as described in Example 5 
except the second antibody-conjugate was an anti-human IgG, 
γ-chain specific, peroxidase conjugate (Sigma Chemical Co., 
St. Louis, MO 63178) used at a dilution of 1:500. Fifty 
µg/ml yeast extract was added to the dilutions of plasma 
samples and antibody conjugate to reduce non-specific 
background due to preexisting yeast antibodies in the 
baboons. The antibody titer results are presented in 
Table 14. 

Table 14 

Immunization no.  Weeks  Antigen          wpi a / Baboon No.  Ab-titer b 
1                 0      gagmod DNA       0 w/219             < 10 
                                          0 w/220             < 10 
                                          0 w/221             < 10 
                                          0 w/222             < 10 
                                          2 wp 1st/219        < 10 
                                          2 wp 1st/220        < 10 
                                          2 wp 1st/221        < 10 
                                          2 wp 1st/222        15 
4                 14     gagmod DNA       2 wp 4th/219        < 10 
                                          2 wp 4th/220        88 
                                          2 wp 4th/221        < 10 
                                          2 wp 4th/222        56 
5                 30     gagmod DNA       2 wp 5th/219        < 10 
                                          2 wp 5th/220        391 
                                          2 wp 5th/221        237 
                                          2 wp 5th/222        222 
6                 46     gag VLP protein  2 wp 6th/219        753 
                                          2 wp 6th/219        4330 
                                          2 wp 6th/219        5000 
                                          2 wp 6th/219        2881 

a wpi = weeks post immunization 
b geometric mean antibody titer 

30 



In Table 14, pre-bleed data are given as 
Immunization No. 0; data for bleeds taken 2 weeks 
post-first immunization are given as Immunization No. 1; data 
for bleeds taken 2 weeks post-second immunization are 
given as Immunization No. 2; and, data for bleeds taken 2 
weeks post-third immunization are given as Immunization 
No. 3. 

Further, lymphoproliferative responses to p24 
antigen were also observed in baboons 221 and 222 two 
weeks post-fourth immunization (at week 14), and enhanced 
substantially post-boosting with VLP (at week 44 and 76). 
Such proliferation results are indicative of induction of 
T-helper cell functions. 



B. Rhesus Macaques 

The improved potency of the codon-modified gag 
expression plasmid observed in mouse and baboon studies 
was confirmed in rhesus macaques. Four of four macaques 
had detectable Gag-specific CTL after two or three 1 mg 
doses of modified gag plasmid. In contrast, in a 
previous study, only one of four macaques given 1 mg 
doses of plasmid-DNA encoding the wild-type HIV-1 SF2 Gag 
showed strong CTL activity, and this was not apparent until 
after the seventh immunization. Further evidence of the 
potency of the modified gag plasmid was the observation 
that CTL from two of the four rhesus macaques reacted 
with three nonoverlapping Gag peptide pools, suggesting 
that as many as three different Gag peptides are 
recognized and indicating that the CTL response is 
polyclonal. Additional quantification and specificity 
studies are in progress to further characterize the T 
cell responses to Gag in the plasmid-immunized rhesus 
macaques. DNA immunization of macaques with the modified 
gag plasmid did not result in significant antibody 
responses, with only two of four animals seroconverting 
at low titers. In contrast, in the same study the 
majority of macaques in groups immunized with p55Gag 
protein seroconverted and had strong Gag-specific 
antibody titers. These data suggest that a prime-boost 
strategy (DNA-prime and protein-boost) could be very 
promising for the induction of a strong CTL and antibody 
response. 

In sum, these results demonstrate that the synthetic 
Gag plasmid DNA is immunogenic in non-human primates. 
When similar experiments were carried out using wild-type 
Gag plasmid DNA, no such induction of anti-p24 antibodies 
was observed after four immunizations. 

Example 12 

DNA- and Protein Immunizations of Animals Using Env 
Expression Cassettes and Polypeptides 

A. Guinea Pigs 

Groups comprising six guinea pigs each were 
15 immunized intramuscularly at 0, 4, and 12 weeks with 
plasmid DNAs encoding the gpl20 .modUS4 , gpl40 . modUS4 , 
gpl4 0 .modUS4 .del VI, gpl4 0 .modUS4 .delV2 , 

gpl40.modUS4.delVl/V2, or gpl60.modUS4 coding sequences 
of the US4 -derived Env. The animals were subsequently 

20 boosted at 18 weeks with a single intramuscular dose of 
US4 o-gpl40 .mut .modUS4 protein in MF59 adjuvant. Anti- 
gp!20 SF2 antibody titers (geometric mean titers) were 
measured at two weeks following the third DNA 
immunization^ aiic|.^3^;, two* weeks after the protein boost. 

25 Results are shown in Table 15. 
Table 15

Group                     GMT post-DNA      GMT post-protein
                          immunization      boost

gp120.modUS4              2098              9489
gp140.modUS4              190               5340
gp140.modUS4.delV1        341               7808
gp140.modUS4.delV2        386               8165
gp140.modUS4.delV1/V2     664               8270
gp160.modUS4              235               9928

These results demonstrate the usefulness of the
synthetic constructs to generate immune responses, as
well as the advantage of providing a protein boost to
enhance the immune response following DNA immunization.
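The effect of the protein boost in Table 15 can be expressed as a fold increase in geometric mean titer (GMT) for each group. The following is a minimal illustrative sketch, not part of the described methods: the function names `geometric_mean` and `fold_boost` are hypothetical helpers, the group GMTs are transcribed from Table 15, and the individual titers passed to `geometric_mean` in the final line are invented solely to show the calculation.

```python
import math


def geometric_mean(titers):
    """Geometric mean of positive titers, computed on the log scale."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


def fold_boost(gmt_post_dna, gmt_post_boost):
    """Ratio of the post-boost GMT to the post-DNA GMT."""
    return gmt_post_boost / gmt_post_dna


# Group GMTs transcribed from Table 15: (post-DNA, post-protein boost)
TABLE_15 = {
    "gp120.modUS4": (2098, 9489),
    "gp140.modUS4": (190, 5340),
    "gp140.modUS4.delV1": (341, 7808),
    "gp140.modUS4.delV2": (386, 8165),
    "gp140.modUS4.delV1/V2": (664, 8270),
    "gp160.modUS4": (235, 9928),
}

if __name__ == "__main__":
    for group, (post_dna, post_boost) in TABLE_15.items():
        print(f"{group:24s} {fold_boost(post_dna, post_boost):6.1f}-fold")
    # Hypothetical individual titers, used only to illustrate a GMT calculation
    print(geometric_mean([100, 1000, 10000]))
```

Averaging on the log scale keeps a single very high responder from dominating the group value, which is why titers are usually summarized as geometric rather than arithmetic means.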



B. Rabbits

Rabbits were immunized intramuscularly and
intradermally using a Bioject needleless syringe with
plasmid DNAs encoding the following synthetic SF162 Env
polypeptides: gp120.modSF162, gp120.modSF162.delV2,
gp140.modSF162, gp140.modSF162.delV2, gp140.mut.modSF162,
gp140.mut.modSF162.delV2, gp160.modSF162, and
gp160.modSF162.delV2. Approximately 1 mg of plasmid DNA
(pCMVlink) carrying the synthetic Env expression cassette
was used to immunize the rabbits. Rabbits were immunized
with plasmid DNA at 0, 4, and 12 weeks. At two weeks
after the third immunization all of the constructs were
shown to have generated significant antibody titers in
the test animals. Further, rabbits immunized with
constructs containing deletions of the V2 region
generally generated similar antibody titers relative to
rabbits immunized with the companion construct still
containing the V2 region.

The nucleic acid immunizations were followed by
protein boosting with o-gp140.modSF162.delV2 (0.1 mg of
purified protein) at 24 weeks after the initial
immunization. Results are shown in Table 16.

Table 16

Group                         GMT 2 wks        GMT 2 wks        GMT 2 wks
                              post-2nd DNA     post-3rd DNA     post-protein
                              immunization     immunization     boost

gp120.modSF162                4573             5899             26033
gp120.modSF162.delV2          3811             3122             29606
gp140.modSF162                1478             710              12882
gp140.modSF162.delV2          1572             819              11067
gp140.mut.modSF162            1417             788              8827
gp140.mut.modSF162.delV2      1378             1207             13301
gp160.modSF162                23               81               7050
gp160.modSF162.delV2          85               459              11568

All constructs are highly immunogenic and generate 
substantial antigen binding antibody responses after only 
2 immunizations in rabbits. 

C. Baboons

Groups of four baboons were immunized
intramuscularly with 1 mg doses of DNA encoding different
forms of synthetic US4 gp140 (see the following table) at
0, 4, 8, 12, 28, and 44 weeks. The animals were also
boosted twice with US4 o-gp140 protein (gp140.mut.modUS4)
at 44 and 76 weeks using MF59 as adjuvant. Results are
shown in Table 17.
Table 17

Animal     Treatment                      2 Wks post      2 Wks post 6th DNA     2 Wks post 7th DNA
                                          5th DNA         (plus o-gp140 prot.    (o-gp140 protein
                                          immunization    immuniz.)              only)

CY 215     gp140.modUS4                   8.3             446                    1813
CY 216                                    8.3             433                    1236
CY 217                                    68              1660                   2989
CY 218                                    101             2556                   1610
Geomean                                   26.2            951.4                  1812.1

CY 219     gp140.modUS4 + p55gag.SF2      8.3             8.3                    421
CY 220                                    8.3             8.3                    3117
CY 221                                    8.3             954                    871
CY 222                                    8.3             71                     916
Geomean                                   8.3             46.5                   1011.5

CY 223     gp140.mut.modUS4               41.4            10497                  46432
CY 224                                    8.3             979                    470
CY 225                                    135             2935                   3870
CY 226                                    47              1209                   4009
Geomean                                   68.3            2457.4                 4289.6

CY 227     gp140TM.modUS4                 8.3             56                     5001
CY 228                                    8.3             806                    1170
CY 229                                    8.3             48                     3402
CY 230                                    8.3             38                     6520
GMT*                                      8.3             95.3                   3375.3

*GMT = geometric mean titer



The results in Table 17 demonstrate the usefulness
of the synthetic constructs to generate immune responses
in primates such as baboons. In addition, all animals
showed evidence of antigen-specific (Env antigen)
lymphoproliferative responses.

D. Rhesus Macaques 

Two rhesus macaques (designated H445 and J408) were
immunized with 1 mg of DNA encoding SF162 gp140 with a
deleted V2 region (SF162.gp140.delV2) by intramuscular
(IM) and intradermal (ID) routes at 0, 4, 8, and 28
weeks. Approximately 100 µg of the protein encoded by
the SF162.gp140mut.delV2 construct was also administered
in MF59 by IM delivery at 28 weeks.

ELISA titers are shown in Figure 61. Neutralizing
antibody activity is shown in Tables 18 and 19.
Neutralizing antibody activity was determined against a
variety of primary HIV-1 isolates in a primary lymphocyte
or "PBMC-based" assay (see the following tables).
Further, the phenotypic co-receptor usage for each of the
primary isolates is indicated. As can be seen in the
tables, neutralizing antibodies were detected against
every isolate tested, including the HIV-1 primary
isolates (i.e., SF128A, 92US660, 92HT593, 92US657,
92US714, 91US056, and 91US054).
Lymphoproliferative activity (LPA) was also
determined by antigenic stimulation followed by uptake of
3H-thymidine in these animals and is shown in Table 20.
Experiment 1 was performed at 14 weeks post third DNA
immunization and Experiment 2 was performed at 2 weeks
post fourth DNA immunization using DNA and protein. For
gp120ThaiE, gp120SF2 and US4 o-gp140, appropriate
background values were used to calculate Stimulation
Indices (S.I.; Antigenic stimulation CPM/Background CPM).
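The Stimulation Index defined above is a simple ratio of counts per minute. The sketch below illustrates the arithmetic only; the function name `stimulation_index`, the use of replicate wells averaged before division, and all CPM values are assumptions made for illustration and are not taken from the experiments described here.

```python
from statistics import mean


def stimulation_index(antigen_cpm, background_cpm):
    """S.I. = mean antigen-stimulated CPM / mean background CPM.

    Both arguments are sequences of replicate-well counts (an assumption;
    the source text only gives the ratio Ag CPM / Background CPM).
    """
    background = mean(background_cpm)
    if background <= 0:
        raise ValueError("background CPM must be positive")
    return mean(antigen_cpm) / background


if __name__ == "__main__":
    # Hypothetical triplicate wells for one animal and one antigen
    antigen_wells = [1200, 1300, 1100]
    background_wells = [100, 120, 80]
    print(stimulation_index(antigen_wells, background_wells))
```

Because each antigen preparation can carry its own background, the background wells passed in should be the ones matched to that antigen, as the text notes for gp120ThaiE, gp120SF2 and US4 o-gp140.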



Table 20

S.I.; calculated as Ag CPM/Background CPM

Animal/exp#    gp120ThaiE    gp120 SF2    env2-3SF2    o-gp140US4

J408/#1
H445/#1
J408/#2
H445/#2



As can be seen from the results presented in Table 20,
lymphoproliferative responses to o-gp140.US4 antigen were
also observed in all four animals at both experimental time
points. Such proliferation results are indicative of
induction of T-helper cell functions.

The results presented above demonstrate that the
synthetic gp140.modSF162.delV2 DNA and protein are
immunogenic in non-human primates.
Example 13

In vitro expression of recombinant Sindbis RNA and
DNA containing the synthetic Gag or Env expression
cassettes

A. Synthetic Gag expression cassettes

To evaluate the expression efficiency of the
synthetic Gag expression cassette in Alphavirus vectors,
the synthetic Gag expression cassette was subcloned into
both plasmid DNA-based and recombinant vector
particle-based Sindbis virus vectors. Specifically, a cDNA
vector construct for in vitro transcription of Sindbis virus
RNA vector replicons (pRSIN-luc; Dubensky et al., J. Virol.
70:508-519, 1996) was modified to contain a PmeI site for
plasmid linearization and a polylinker for insertion of
heterologous genes. A polylinker was generated using two
oligonucleotides that contain the sites XhoI, PmlI, ApaI,
NarI, XbaI, and NotI (XPANXNF, SEQ ID NO:17, and XPANXNR,
SEQ ID NO:18).
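The layout of such a polylinker can be checked computationally. The following sketch is illustrative only: it builds a hypothetical top-strand oligonucleotide by simply concatenating the standard recognition sequences of the six enzymes in the stated order, and it is not the XPANXNF/XPANXNR sequence of SEQ ID NO:17 and 18, which is not reproduced here. The helper names `reverse_complement` and `site_positions` are likewise assumptions.

```python
# Standard recognition sequences (5'->3') for the six sites named in the text
SITES = {
    "XhoI": "CTCGAG",
    "PmlI": "CACGTG",
    "ApaI": "GGGCCC",
    "NarI": "GGCGCC",
    "XbaI": "TCTAGA",
    "NotI": "GCGGCCGC",
}
ORDER = ["XhoI", "PmlI", "ApaI", "NarI", "XbaI", "NotI"]

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def site_positions(seq):
    """Map each enzyme to the index of its first site in seq (-1 if absent)."""
    return {name: seq.find(site) for name, site in SITES.items()}


# Hypothetical polylinker: the sites placed back to back, with no spacers
top_strand = "".join(SITES[name] for name in ORDER)
bottom_strand = reverse_complement(top_strand)

if __name__ == "__main__":
    print(top_strand)
    print(bottom_strand)
    print(site_positions(top_strand))
```

All six recognition sequences are palindromic, so each site reads identically on the complementary oligonucleotide; a real design would also be screened for unintended sites created at the junctions.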

The plasmid pRSIN-luc (Dubensky et al., supra) was
digested with XhoI and NotI to remove the luciferase gene
insert, blunt-ended using Klenow and dNTPs, and purified
from an agarose gel using GeneClean II (Bio101, Vista,
CA). The oligonucleotides were annealed to each other
and ligated into the plasmid. The resulting construct
was digested with NotI and SacI to remove the minimal
Sindbis 3'-end sequence and A40 tract, and ligated with an
approximately 0.4 kbp fragment from pKSSIN1-BV (WO
97/38087). This 0.4 kbp fragment was obtained by
digestion of pKSSIN1-BV with NotI and SacI, and
purification after size fractionation from an agarose
gel. The fragment contained the complete Sindbis virus
3'-end, an A40 tract and a PmeI site for linearization.
This new vector construct was designated SINBVE.
The synthetic HIV Gag coding sequence was obtained
from the parental plasmid by digestion with EcoRI,
blunt-ending with Klenow and dNTPs, purification with
GeneClean II, digestion with SalI, size fractionation on
an agarose gel, and purification from the agarose gel
using GeneClean II. The synthetic Gag coding fragment was
ligated into the SINBVE vector that had been digested
with XhoI and PmlI. The resulting vector was purified
using GeneClean II and designated SINBVGag. Vector RNA
replicons may be transcribed in vitro (Dubensky et al.,
supra) from SINBVGag and used directly for transfection
of cells. Alternatively, the replicons may be packaged
into recombinant vector particles by co-transfection with
defective helper RNAs or using an alphavirus packaging
cell line as described, for example, in U.S. Patent
Numbers 5,843,723 and 5,789,245, and then administered in
vivo as described.
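The cloning step above joins a SalI end to an XhoI end and a Klenow-filled EcoRI end to a PmlI end. The sketch below illustrates why those pairings are compatible, using the standard recognition sequences and cut positions of the four enzymes. The helper names (`overhang`, `klenow_fill`, `can_ligate`) are hypothetical, and the simple string comparison is valid only because the overhangs involved are palindromic.

```python
# Recognition sequence and top-strand cut offset for each enzyme
# (e.g. SalI cuts G^TCGAC, so the offset is 1; PmlI cuts CAC^GTG, offset 3).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "SalI": ("GTCGAC", 1),
    "XhoI": ("CTCGAG", 1),
    "PmlI": ("CACGTG", 3),
}


def overhang(enzyme):
    """Single-stranded 5' overhang left by the enzyme ('' for a blunt cutter).

    For a palindromic site cut at offset c on the top strand, the bottom
    strand is cut at len(site) - c, leaving site[c:len(site) - c] unpaired.
    """
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def klenow_fill(overhang_seq):
    """Filling in a 5' overhang with Klenow and dNTPs leaves a blunt end."""
    return ""


def can_ligate(end_a, end_b):
    """Two ends are treated as compatible if both are blunt or if they carry
    the same palindromic overhang (true for the enzymes listed above)."""
    return end_a == end_b


if __name__ == "__main__":
    print(overhang("SalI"), overhang("XhoI"))                        # cohesive ends
    print(can_ligate(overhang("SalI"), overhang("XhoI")))
    print(can_ligate(klenow_fill(overhang("EcoRI")), overhang("PmlI")))
```

Using one cohesive junction and one blunt junction fixes the orientation of the insert in the vector, since each end of the fragment can pair with only one end of the cut plasmid.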

The DNA-based Sindbis virus vector pDCMVSIN-beta-gal
(Dubensky et al., J. Virol. 70:508-519, 1996) was
digested with SalI and XbaI to remove the
beta-galactosidase gene insert, and purified using
GeneClean II after agarose gel size fractionation. The HIV
Gag gene was inserted into pDCMVSIN-beta-gal by digestion
of SINBVGag with SalI and XhoI, purification using
GeneClean II of the Gag-containing fragment after agarose
gel size fractionation, and ligation. The resulting
construct was designated pDSIN-Gag, and may be used
directly for in vivo administration or formulated using
any of the methods described herein.

BHK and 293 cells were transfected with recombinant
Sindbis vector RNA and DNA, respectively. The
supernatants and cell lysates were tested with the
Coulter p24 capture ELISA (Example 2).

BHK cells were transfected by electroporation with
recombinant Sindbis RNA. The expression of p24 (in
ng/ml) is presented in Table 21. In the table, SINGag#1
and 2 represent duplicate measurements, and SINβgal
represents a negative control. Supernatants and lysates
were collected 24h post transfection.



Table 21

Construct         Supernatant     Lysate

SINβgal RNA       0               0
SINGag#1 RNA      7 ng            Max (approx. 1 µg)
SINGag#2 RNA      1 ng            700 ng



293 cells were transfected using LT-1 (Example 2)
with recombinant Sindbis DNA. Synthetic
pCMVKM2.GagMod.SF2 was used as a positive control.
Supernatants and lysates were collected 48h post
transfection. The expression of p24 (in ng/ml) is
presented in Table 22.



Table 22

Construct                  Supernatant     Lysate

SINGag DNA                 3               30
pCMVKM2.GagMod.SF2 DNA     32              42



The results presented in Tables 21 and 22 
demonstrate that Gag proteins can be efficiently 
expressed from both DNA and RNA-based Sindbis vector 
systems using the synthetic Gag expression cassette 
(p55Gag.mod) . 

B. Synthetic Env expression cassettes

To evaluate the expression efficiency of the
synthetic Env expression cassette in Alphavirus vectors,
synthetic Env expression cassettes were subcloned into
both plasmid DNA-based and recombinant vector
particle-based Sindbis virus vectors as described above
for Gag.

The synthetic HIV Env coding sequence was obtained
from the parental plasmid by digestion with SalI and
XbaI, size fractionation on an agarose gel, and
purification from the agarose gel using GeneClean II. The
synthetic Env coding fragment was ligated into the SINBVE
vector that had been digested with XhoI and XbaI. The
resulting vector was purified using GeneClean II and
designated SINBVEnv. Vector RNA replicons may be
transcribed in vitro (Dubensky et al., supra) from
SINBVEnv and used directly for transfection of cells.
Alternatively, the replicons may be packaged into
recombinant vector particles by co-transfection with
defective helper RNAs or using an alphavirus packaging
cell line and administered as described above for Gag.

The DNA-based Sindbis virus vector pDCMVSIN-beta-gal
(Dubensky et al., J. Virol. 70:508-519, 1996) was
digested with SalI and XbaI to remove the
beta-galactosidase gene insert, and purified using
GeneClean II after agarose gel size fractionation. The HIV
Env gene was inserted into pDCMVSIN-beta-gal by digestion
of SINBVEnv with XbaI and XhoI, purification using
GeneClean II of the Env-containing fragment after agarose
gel size fractionation, and ligation. The resulting
construct was designated pDSIN-Env, and may be used
directly for in vivo administration or formulated using
any of the methods described herein.

BHK and 293 cells were transfected with recombinant
Sindbis vector RNA and DNA, respectively. The
supernatants and cell lysates were tested by capture
ELISA.

BHK cells were transfected by electroporation with
recombinant Sindbis RNA. The expression of Env (in
ng/ml) is presented in Table 23. In the table, the
Sindbis RNAs containing synthetic Env expression cassettes
are indicated and βgal represents a negative control.
Supernatants and lysates were collected 24h post
transfection.



Table 23

Construct                  Supernatant        Lysate
                           (Neat) ng/ml       (1:10 dilution) ng/ml

βgal RNA                   0                  0
gp140.modUS4               726                7147
gp140.modSF162             3529               7772
gp140.modUS4.delV1/V2      1738               6526
gp140.modUS4.delV2         960                3023
gp140.modSF162.delV2       2772               3359



293 cells were transfected using LT-1 mediated
transfection (PanVera) with recombinant Sindbis DNA
containing synthetic expression cassettes of the present
invention and βgal sequences as a negative control.
Supernatants and lysates were collected 48h post
transfection. The expression of Env (in ng/ml) is
presented in Table 24.
Table 24

Construct

βgal
gp140.modSF162.delV2
gp140.modSF162



The results presented in Tables 23 and 24 
demonstrated that Env proteins can be efficiently 
expressed from both DNA and RNA-based Sindbis vector 
systems using the synthetic Env expression cassettes of 
the present invention. 



Example 14

A. In vivo Immunization with Gag-Containing DNA and/or
Sindbis particles

CB6F1 mice were immunized intramuscularly at 0 and 4
weeks with plasmid DNA and/or Sindbis vector
RNA-containing particles, each containing GagMod.SF2
sequences as indicated in Table 25. Animals were challenged
with recombinant vaccinia expressing SF2 Gag at 3 weeks post
second immunization (at week 7). Spleens were removed
from the immunized and challenged animals 5 days later
for a standard 51Cr release assay for CTL activity. Values
shown in Table 25 indicate the results from the spleens
of three mice from each group. The boxed values in Table
25 indicate that all groups of mice receiving
immunizations with pCMVKm2.GagMod.SF2 DNA and/or
SindbisGagMod.SF2 virus particles, either alone or in
combination, showed antigen-specific CTL activity.
Table 25

a 20 µg
b 10^7 particles
Values represent results from 3 pooled mouse spleens per group.

B. In Vivo Immunization with Env-Containing DNA and/or
Sindbis particles

Balb/C mice were immunized intramuscularly at 0 and
4 weeks (as shown in the following table) with plasmid DNA
and/or Sindbis-virus RNA-containing particles, each
containing gp120.modUS4 sequences. Treatment regimes and
antibody titers are shown in Table 26. Antibody titers
were determined by ELISA using gp120 SF2 protein to coat
the plates.
Table 26

           Treatment                                 Bleed 0        Bleed 1 (8 wks)   Bleed 2 (10 wks)
Animal     1st Immunization    2nd Immunization      1st Imm'n      2nd Imm'n         2 Wks post 2nd

EO 456     25 µg 120mod DNA    (None)                8.3            45                309
EO 457                                               8.3            254               460
EO 458                                               8.3            8.3               93
EO 459                                               8.3            43                45
EO 460                                               8.3            8.3               274

EO 461     25 µg 120mod DNA    25 µg 120mod DNA      8.3            47                1502
EO 462                                               8.3            80                5776
EO 463                                               8.3            89                3440
EO 464                                               8.3            8.3               3347
EO 465                                               8.3            69                1127

EO 466     50 µg 120mod DNA    (None)                8.3            63                102
EO 467                                               8.3            112               662
EO 468                                               8.3            94                459
EO 469                                               8.3            58                48
EO 470                                               8.3            95                355

EO 471     50 µg 120mod DNA    50 µg 120mod DNA      8.3            110               9074
EO 472                                               8.3            8.3               4897
EO 473                                               8.3            49                4089
EO 474                                               8.3            59                5280
EO 475                                               8.3            8.3               929

EO 476     25 µg 120mod DNA    Sindbis/Env           8.3                              653
EO 477                                               8.3            87                22675
EO 478                                               8.3            76                3869
EO 479                                               8.3                              1004
EO 480                                               8.3            71                7080

EO 481     Sindbis/Env         (None)                8.3            8.3               8.3
EO 482                                               8.3            8.3               8.3
EO 483                                               8.3            78                103
EO 484                                               8.3            8.3               32
EO 485                                               8.3            76                207

EO 486     Sindbis/Env         Sindbis/Env           8.3            8.3               458
EO 487                                               8.3            8.3               345
EO 488                                               8.3            8.3               331
EO 489                                               8.3            103               111
EO 490                                               8.3            8.3               5636



As can be seen from the data presented above, the
mice generally demonstrated substantial immunological
responses by bleed number 2. For Env, the best results
were obtained using either (i) 50 µg of gp120.modUS4 DNA
for the first immunization followed by a second
immunization using 50 µg of gp120.modUS4 DNA, or (ii) 25
µg of gp120.modUS4 DNA for the first immunization
followed by a second immunization using 10^7 pfu of
Sindbis.

The results presented above demonstrate that the Env
and Gag proteins of the present invention are effective
to induce an immune response using Sindbis vector systems
which include the synthetic Env (e.g., gp120.modUS4) or
Gag expression cassettes.

Example 15 

Co-Transfection of Env and Gag as Monocistronic and
Bicistronic Constructs

DNA constructs encoding (i) wild-type US4 and SF162
Env polypeptides, (ii) synthetic US4 and SF162 Env
polypeptides (gp160.modUS4, gp160.modUS4.delV1/V2,
gp160.modSF162, and gp120.modSF162.delV2), and (iii)
SF2gag polypeptide (i.e., the Gag coding sequences
obtained from the SF2 variant or optimized sequences
corresponding to the gagSF2, designated gag.modSF2) were
prepared. These monocistronic constructs were
co-transfected into 293T cells in a transient transfection
protocol using the following combinations: gp160.modUS4;
gp160.modUS4 and gag.modSF2; gp160.modUS4.delV1/V2;
gp160.modUS4.delV1/V2 and gag.modSF2; gp160.modSF162 and
gag.modSF2; gp120.modSF162.delV2 and gag.modSF2; and
gag.modSF2 alone.

Further, several bicistronic constructs were made
in which the coding sequences for Env and Gag were under
the control of a single CMV promoter and an IRES (internal
ribosome entry site; EMCV IRES; Kozak, M., Critical Reviews
in Biochemistry and Molecular Biology 27(45):385-402, 1992;
Witherell, G.W., et al., Virology 214:660-663, 1995)
sequence was introduced after the Env coding sequence and
before the Gag coding sequence. Those constructs were as
follows: gp160.modUS4.gag.modSF2, SEQ ID NO:73 (Figure 61);
gp160.modSF162.gag.modSF2, SEQ ID NO:74 (Figure 62);
gp160.modUS4.delV1/V2.gag.modSF2, SEQ ID NO:75 (Figure
63); and gp160.modSF162.delV2.gag.modSF2, SEQ ID NO:76
(Figure 64).

Supernatants from cell culture were filtered through
0.45 µm filters, then ultracentrifuged for 2 hours at
24,000 rpm (140,000 x g) in an SW28 rotor through a 20%
sucrose cushion. The pelleted materials were suspended
and layered on a 20-60% sucrose gradient and spun for 2
hours at 40,000 rpm (285,000 x g) in an SW41Ti rotor.
Gradients were fractionated into 1.0 ml samples. A total
of 9-10 fractions were typically collected from each DNA
transfection group.

The fractions were tested for the presence of the
Env and Gag proteins (across all fractions). These
results demonstrated that the appropriate proteins were
expressed in the transfected cells (i.e., if an Env
coding sequence was present the corresponding Env protein
was detected; if a Gag coding sequence was present the
corresponding Gag protein was detected).

Virus-like particles (VLPs) were known to be present
through a selected range of sucrose densities. Chimeric
virus-like particles (VLPs) were formed using all the
tested combinations of constructs containing both Env and
Gag. Significantly more protein was found in the
supernatant collected from the cells transfected with
"gp160.modUS4.delV1/V2 and gag.modSF2" than in all the
other supernatants.

Western blot analysis was also performed on sucrose
gradient fractions from each transfection. The results
show that bicistronic plasmids gave lower amounts of VLPs
than the amounts obtained using co-transfection with
monocistronic plasmids.

In order to verify the production of chimeric VLPs 
by these cell lines the following electron microscopic 
analysis was carried out. 

293T cells were plated at a density of 60-70%
confluence in 100 mm dishes on the day before
transfection. The cells were transfected with 10 µg of
DNA in transfection reagent LT1 (Panvera Corporation, 545
Science Dr., Madison, WI). The cells were incubated
overnight in reduced serum medium (Opti-MEM, Gibco-BRL,
Gaithersburg, MD). The medium was replaced with 10%
fetal calf serum, 2% glutamine in IMDM in the morning of
the next day and the cells were incubated for 65 hours.
Supernatants and lysates were collected for analysis as
described above (see Example 2).

The fixed, transfected 293T cells and purified
ENV-GAG VLPs were analyzed by electron microscopy. The cells
were fixed as follows. Cell monolayers were washed twice
with PBS and fixed with 2% glutaraldehyde. For purified
VLPs, gradient peak fractions were collected and
concentrated by ultracentrifugation (24,000 rpm) for 2
hours. Electron microscopic analysis was performed by
Prof. T.S. Benedict Yen (Veterans Affairs, Medical
Center, San Francisco, CA).

Electron microscopy was carried out using a 
transmission electron microscope (Zeiss 10c) . The cells 
were pre-stained with osmium and stained with uranium 
acetate and lead citrate. Immunostaining was performed 
to visualize envelope on the VLP. The magnification was
100,000X.

Figures 65A-65F show micrographs of 293T cells
transfected with the following constructs: Figure 65A,
gag.modSF2; Figure 65B, gp160.modUS4; Figure 65C,
gp160.modUS4.delV1/V2.gag.modSF2 (bicistronic Env and
Gag); Figures 65D and 65E, gp160.modUS4.delV1/V2 and
gag.modSF2; and Figure 65F, gp120.modSF162.delV2 and
gag.modSF2. In the figures, free and budding immature
virus-like particles (VLPs) of the expected size
(approximately 100 nm) decorated with the Env protein
were seen. In sum, gp160 polypeptides incorporate into
Gag VLPs when constructs are co-transfected into cells.
The efficiency of incorporation is 2-3 fold higher when
constructs encoding V-deleted Env polypeptides from
high-expressing synthetic expression cassettes are used.

Although preferred embodiments of the subject 
invention have been described in some detail, it is 
understood that obvious variations can be made without 
departing from the spirit and the scope of the invention 
as defined by the appended claims. 
What Is Claimed Is:

1. An expression cassette, comprising
a polynucleotide sequence encoding a polypeptide
including an HIV Gag polypeptide, wherein the
polynucleotide sequence encoding said Gag polypeptide
comprises a sequence having at least 90% sequence
identity to the sequence presented as SEQ ID NO:20.

2. The expression cassette of claim 1, comprising
a polynucleotide sequence encoding a polypeptide
including an HIV Gag polypeptide, wherein the
polynucleotide sequence encoding said Gag polypeptide
comprises a sequence having at least 90% sequence
identity to the sequence presented as SEQ ID NO:9.

3. The expression cassette of claim 1, wherein said
polynucleotide sequence encoding a polypeptide including
an HIV Gag polypeptide comprises a sequence having at
least 90% sequence identity to the sequence presented as
SEQ ID NO:4.

4. The expression cassette of claim 1, wherein said
polynucleotide sequence further includes a polynucleotide
sequence encoding an HIV protease polypeptide.

5. The expression cassette of claim 4, wherein the
nucleotide sequence encoding said polypeptide comprises a
sequence having at least 90% sequence identity to a
sequence selected from the group consisting of: SEQ ID
NO:5, SEQ ID NO:78, and SEQ ID NO:79.

6. The expression cassette of claim 1, wherein said
polynucleotide sequence further includes a polynucleotide
sequence encoding an HIV reverse transcriptase
polypeptide.

7. The expression cassette of claim 6, wherein the
nucleotide sequence encoding said polypeptide comprises a
sequence having at least 90% sequence identity to a
sequence selected from the group consisting of: SEQ ID
NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, and SEQ
ID NO:84.

8. The expression cassette of claim 1, wherein said
polynucleotide sequence further includes a polynucleotide
sequence encoding an HIV tat polypeptide.

9. The expression cassette of claim 8, wherein the
nucleotide sequence encoding said polypeptide comprises a
sequence having at least 90% sequence identity to a
sequence selected from the group consisting of: SEQ ID
NO:86, SEQ ID NO:87, SEQ ID NO:88 and SEQ ID NO:89.

10. The expression cassette of claim 1, wherein
said polynucleotide sequence further includes a
polynucleotide sequence encoding an HIV polymerase
polypeptide, wherein the nucleotide sequence encoding
said polypeptide comprises a sequence having at least 90%
sequence identity to the sequence presented as SEQ ID
NO:6.



11. The expression cassette of claim 1, wherein
said polynucleotide sequence further includes a
polynucleotide sequence encoding an HIV polymerase
polypeptide, wherein (i) the nucleotide sequence encoding
said polypeptide comprises a sequence having at least 90%
sequence identity to the sequence presented as SEQ ID
NO:4, and (ii) wherein the sequence is modified by
deletions of coding regions corresponding to reverse
transcriptase and integrase.

12. The expression cassette of claim 11, wherein 
said polynucleotide sequence preserves T-helper cell and 
CTL epitopes. 

13. The expression cassette of claim 1, wherein 
said polynucleotide sequence further includes a 
polynucleotide sequence encoding an HCV core polypeptide, 
wherein the nucleotide sequence encoding said polypeptide 
comprises a sequence having at least 90% sequence 
identity to the sequence presented as SEQ ID NO: 7. 

14. An expression cassette, comprising a
polynucleotide sequence encoding a polypeptide including
an HIV Env polypeptide, wherein the polynucleotide
sequence encoding said Env polypeptide comprises a
sequence having at least 90% sequence identity to SEQ ID
NO:71 (Figure 58) or SEQ ID NO:72 (Figure 59).

15. The expression cassette of claim 14, wherein
said Env polypeptide includes sequences flanking a V1
region but has a deletion in the V1 region itself.

16. The expression cassette of claim 15, wherein
the polynucleotide sequence encoding the polypeptide
comprises the sequence presented as SEQ ID NO:65 (Figure
52; gp160.modUS4.delV1).

17. The expression cassette of claim 14, wherein
said Env polypeptide includes sequences flanking a V2
region but has a deletion in the V2 region itself.

18. The expression cassette of claim 17, wherein
the polynucleotide sequence encoding the polypeptide is
selected from the group consisting of: SEQ ID NO:60
(Figure 47); and SEQ ID NO:66 (Figure 53).



19. The expression cassette of claim 17, wherein
the polynucleotide sequence encoding the polypeptide is
selected from the group consisting of: SEQ ID NO:34
(Figure 20); SEQ ID NO:37 (Figure 24); SEQ ID NO:40
(Figure 27); SEQ ID NO:43 (Figure 30); SEQ ID NO:46
(Figure 33); SEQ ID NO:49 (Figure 36); and SEQ ID NO:76
(Figure 64).

20. The expression cassette of claim 14, wherein
said Env polypeptide includes sequences flanking a V1/V2
region but has a deletion in the V1/V2 region itself.

21. The expression cassette of claim 20, wherein
the polynucleotide sequence encoding the polypeptide is
selected from the group consisting of: SEQ ID NO:59
(Figure 46); SEQ ID NO:61 (Figure 48); SEQ ID NO:67
(Figure 54); and SEQ ID NO:75 (Figure 63).

22. The expression cassette of claim 20, wherein
the polynucleotide sequence encoding the polypeptide is
selected from the group consisting of: SEQ ID NO:35
(Figure 21); SEQ ID NO:38 (Figure 25); SEQ ID NO:41
(Figure 28); SEQ ID NO:44 (Figure 31); SEQ ID NO:47
(Figure 34); and SEQ ID NO:50 (Figure 37).
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23. The expression cassette of claim 14, wherein 
said Env polypeptide has a mutated cleavage site that 
prevents the cleavage of a gpl4 0 polypeptide into a gpl2 0 
polypeptide and a gp41 polypeptide. 

24. The expression cassette of claim 23, wherein
the polynucleotide sequence encoding the polypeptide is
selected from the group consisting of: SEQ ID NO:57
(Figure 44); SEQ ID NO:61 (Figure 48); and SEQ ID NO:63
(Figure 50).

25. The expression cassette of claim 23, wherein 
the polynucleotide sequence encoding the polypeptide is
selected from the group consisting of: SEQ ID NO:39
(Figure 26); SEQ ID NO:40 (Figure 27); SEQ ID NO:41
(Figure 28); SEQ ID NO:42 (Figure 29); SEQ ID NO:43
(Figure 30); SEQ ID NO:44 (Figure 31); SEQ ID NO:45
(Figure 32); SEQ ID NO:46 (Figure 33); and SEQ ID NO:47
(Figure 34).

26. The expression cassette of claim 14, wherein 
said Env polypeptide includes a gp160 Env polypeptide or
a polypeptide derived from a gp160 Env polypeptide.

27. The expression cassette of claim 26, wherein 
the polynucleotide sequence encoding the polypeptide is 
selected from the group consisting of: SEQ ID NO:64
(Figure 51); SEQ ID NO:65 (Figure 52); SEQ ID NO:66
(Figure 53); SEQ ID NO:67 (Figure 54); SEQ ID NO:68
(Figure 55); SEQ ID NO:75 (Figure 63); and SEQ ID NO:73
(Figure 61).

28. The expression cassette of claim 26, wherein 
the polynucleotide sequence encoding the polypeptide is
selected from the group consisting of: SEQ ID NO:48
(Figure 35); SEQ ID NO:49 (Figure 36); SEQ ID NO:50
(Figure 37); SEQ ID NO:76 (Figure 64); and SEQ ID NO:74
(Figure 62).



29. The expression cassette of claim 14, wherein 
said Env polypeptide includes a gp140 Env polypeptide or
a polypeptide derived from a gp140 Env polypeptide.

30. The expression cassette of claim 29, wherein 
the polynucleotide sequence encoding the polypeptide is 
selected from the group consisting of: SEQ ID NO:56
(Figure 43); SEQ ID NO:57 (Figure 44); SEQ ID NO:58
(Figure 45); SEQ ID NO:59 (Figure 46); SEQ ID NO:60
(Figure 47); SEQ ID NO:61 (Figure 48); SEQ ID NO:62
(Figure 49); and SEQ ID NO:63 (Figure 50).

31. The expression cassette of claim 29, wherein 
the polynucleotide sequence encoding the polypeptide is 
selected from the group consisting of: SEQ ID NO:36
(Figure 23); SEQ ID NO:37 (Figure 24); SEQ ID NO:38
(Figure 25); SEQ ID NO:39 (Figure 26); SEQ ID NO:40
(Figure 27); SEQ ID NO:41 (Figure 28); SEQ ID NO:42
(Figure 29); SEQ ID NO:43 (Figure 30); SEQ ID NO:44
(Figure 31); SEQ ID NO:45 (Figure 32); SEQ ID NO:46
(Figure 33); and SEQ ID NO:47 (Figure 34).

32. The expression cassette of claim 14, wherein 
said Env polypeptide includes a gp120 Env polypeptide or
a polypeptide derived from a gp120 Env polypeptide.

33. The expression cassette of claim 32, wherein
the polynucleotide sequence encoding the polypeptide is
selected from the group consisting of: SEQ ID NO:54
(Figure 41); and SEQ ID NO:55 (Figure 42).

34. The expression cassette of claim 32, wherein
the polynucleotide sequence encoding the polypeptide is
selected from the group consisting of: SEQ ID NO:33
(Figure 19); SEQ ID NO:34 (Figure 20); and SEQ ID NO:35
(Figure 21).

35. The expression cassette of claim 14, wherein
the polynucleotide sequence encoding the polypeptide is
selected from the group consisting of: SEQ ID NO:55
(Figure 42); SEQ ID NO:62 (Figure 49); SEQ ID NO:63
(Figure 50); and SEQ ID NO:68 (Figure 55).

36. A recombinant expression system for use in a
selected host cell, comprising, an expression cassette of
any of claims 1-35, and wherein said polynucleotide
sequence is operably linked to control elements
compatible with expression in the selected host cell.

37. The recombinant expression system of claim 36,
wherein said control elements are selected from the group
consisting of a transcription promoter, a transcription
enhancer element, a transcription termination signal,
polyadenylation sequences, sequences for optimization of
initiation of translation, and translation termination
sequences.

38. The recombinant expression system of claim 36,
wherein said transcription promoter is selected from the
group consisting of CMV, CMV.intron A, SV40, RSV,
HIV-Ltr, MMLV-ltr, and metallothionein.

39. A cell comprising an expression cassette of any
of claims 1-35, and wherein said polynucleotide sequence
is operably linked to control elements compatible with
expression in the selected cell.

40. The cell of claim 39, wherein the cell is a
mammalian cell.

41. The cell of claim 40, wherein the cell is 
selected from the group consisting of BHK, VERO, HT1080,
293, RD, COS-7, and CHO cells.






42. The cell of claim 41, wherein said cell is a 
CHO cell. 



43. The cell of claim 39, wherein the cell is an
insect cell.



44. The cell of claim 43, wherein the cell is 
either Trichoplusia ni (Tn5) or Sf9 insect cells.






45. The cell of claim 39, wherein the cell is a
bacterial cell.



46. The cell of claim 39, wherein the cell is a 
yeast cell. 



47. The cell of claim 39, wherein the cell is a 
plant cell. 

48. The cell of claim 39, wherein the cell is an
antigen presenting cell.

49. The cell of claim 48, wherein the lymphoid cell 
is selected from the group consisting of macrophage, 
monocytes, dendritic cells, B-cells, T-cells, stem cells, 
and progenitor cells thereof. 

50. The cell of claim 39, wherein the cell is a 
primary cell. 

51. The cell of claim 39, wherein the cell is an 
immortalized cell. 

52. The cell of claim 39, wherein the cell is a 
tumor-derived cell. 

53. A method for producing a polypeptide including 
HIV Gag polypeptide sequences, said method comprising, 

incubating the cells of claim 39, under conditions 
for producing said polypeptide. 

54. A method for producing virus-like particles
(VLPs), comprising,

incubating the cells of claim 39, under conditions 
for producing said VLPs. 

55. A method for producing a composition of virus-like
particles (VLPs), comprising,

(a) incubating the cells of claim 39, under
conditions for producing said VLPs; and

(b) substantially purifying said VLPs to produce a 
composition of VLPs. 

56. A cell line useful for packaging lentivirus
vectors, comprising

suitable host cells that have been transfected with
an expression vector containing an expression cassette of
any of claims 1-35, and wherein said polynucleotide
sequence is operably linked to control elements
compatible with expression in the host cell.

57. The cell line of claim 56, wherein suitable
host cells have been transfected with an expression
vector containing the expression cassette of any of
claims 1-13.

58. The cell line of claim 56, wherein suitable
host cells have been transfected with an expression
vector containing the expression cassette of claim 1-3.

59. The cell line of claim 56, wherein suitable
host cells have been transfected with an expression
vector containing the expression cassette of claim 14-35.

60. A gene delivery vector for use in a mammalian
subject, comprising

a suitable gene delivery vector for use in said
subject, wherein the vector comprises an expression
cassette of any of claims 1-35, and wherein said
polynucleotide sequence is operably linked to control
elements compatible with expression in the subject.

61. A method of DNA immunization of a subject,
comprising,

introducing a gene delivery vector of claim 60 into
said subject under conditions that are compatible with
expression of said expression cassette in said subject.

62. The method of claim 61, wherein said gene
delivery vector is a nonviral vector.

63. The method of claim 61, wherein said vector is 
delivered using a particulate carrier. 

64. The method of claim 63, wherein said vector is 
coated on a gold or tungsten particle and said coated 
particle is delivered to said subject using a gene gun. 

65. The method of claim 63, wherein said vector is 
encapsulated in a liposome preparation. 

66. The method of claim 61, wherein said vector is 
a viral vector. 

67. The method of claim 66, wherein said viral 
vector is a retroviral vector. 

68. The method of claim 67, wherein said viral 
vector is a lentiviral vector. 

69. The method of claim 61, wherein said subject is 
a mammal.

70. The method of claim 69, wherein said mammal is 
a human. 

71. A method of generating an immune response in a
subject, comprising

transfecting cells of said subject with a gene delivery
vector of claim 60, under conditions that permit the
expression of said polynucleotide and production of said
polypeptide, thereby eliciting an immunological response
to said polypeptide.

72. The method of claim 71, wherein said vector is 
a nonviral vector. 

73. The method of claim 72, wherein said vector is
delivered using a particulate carrier. 

74. The method of claim 73, wherein said vector is 
coated on a gold or tungsten particle and said coated 
particle is delivered to said vertebrate cell using a 
gene gun. 

75. The method of claim 73, wherein said vector is 
encapsulated in a liposome preparation. 

76. The method of claim 71, wherein said vector is 
a viral vector. 

77. The method of claim 76, wherein said viral 
vector is a retroviral vector. 

78. The method of claim 77, wherein said viral 
vector is a lentiviral vector. 

79. The method of claim 71, wherein said subject is 
a mammal. 



80. The method of claim 79, wherein said mammal is 
a human.



81. The method of claim 71, wherein said
transfecting is done ex vivo and said transfected cells
are reintroduced into said subject.

82. The method of claim 71, wherein said 
transfecting is done in vivo in said subject. 

83. The method of claim 71, where said immune 
response is a humoral immune response. 

84. The method of claim 71, where said immune 
response is a cellular immune response. 

85. A gene delivery vector comprising an alphavirus 
vector construct, wherein said alphavirus construct 
comprises an expression cassette according to any one of 
claims 1 through 35. 

86. The gene delivery vector of claim 85, wherein 
the alphavirus vector construct is a cDNA vector 
construct. 

87. The gene delivery vector of claim 85, wherein 
the alphavirus comprises a recombinant alphavirus 
particle preparation. 

88. The gene delivery vector of claim 85, wherein 
the vector comprises a eukaryotic layered vector 
initiation system. 

89. A method of stimulating an immune response in a 
subject comprising administering the gene delivery vector 
of any one of claims 85 through 88 in an amount effective 
to stimulate an immune response in said subject. 

90. The method of claim 89, wherein the gene
delivery vector is administered intramuscularly,
intramucosally, intranasally, subcutaneously,
intradermally, transdermally, intravaginally,
intrarectally, orally or intravenously.



orig.gagSF2 

ATGGGTGCGAGAGCGTCGGTATTAAGCGGGGGAGAATTAGATAAATGGGAAAAAATTCG^^ 



a* aaaata$ia§& aaaacatat; gtatgggcaagcagggagctagaacgattcgcagtcaatcctggcctgttagaa 

G G C C G C C 



Inact.2 

ACATCAGAAGGCTGCAGACAAATATTGGGACAGCTACAGCCATCCCTTCAGACAGGATCAG^GAACTTAGA 

G G C C 



TnactTT 



T^TAAl fcCAGTAGCAACCCTCTATTGTGTACA rCAAAGGATAGATGTAAAJ GACACCAAGGAAGCTTTAGAGAAGATA 
q \_ GC C C G 



GAGGAAGAGCAAAACAA AAGTAAGAAA&GGCACAGCAfl GCAGCAGCTGCAGCTGGCAC AGGAAACAGCAGCCAGGTC 

GTCC G C G 



AGCCAAAATTACCCTATAGTGCAGAACCTACAGGGGCAAATGGTACATCAGGCCATATCACCTAGAACTTTAAATGCA 
TGGGTAAAAGTAGTAGAAGAAAAGGCTTTCAGCCCAGAAGTAATACCCATGTTTTCAGCATTATCAGAAGGAGCCACC 



ccac; agatttaaacaccatgctaaacac^Jgtggggggacatcaagca^ 

G CC G G T G C 



GAGGAAGCTGCAGAATGGGATAGAGTGCATCCAGTGCATGCAGGGCCTATTGCACCAGGCCAAATGAGAGAACCAAGG 
GGAAGTGACATAGCAGGAACTACTAGTACCCTTCAGGAACAAATAGGATGGATGACAAATAATCCACCTATCCCAGTA 



ggag^tctata^ 



G G 



Inact.7 



G C G C G G 



ATAAGACAAGGACCAAAGGAACCCTTTAGAGATTATGTAGACCGGTTCTATAAAACTCTAAGAGC 



2™ 



, Ina ct.T" 

CAGGATGTAAAAAATTGGATGACAGAAACCTTGTTGGTCCAAAATGCAAACCCAGATTGTAAGAC TATTTTAAAAGCA 

C CC G G T 



CAAGCTTCA 



TTGGGA ZCAGCAGCTACACTAGAAGAAATGATGACAGCATGTCAGGGAGTGGGGGGACCCGGCCATAAAGCAAGAGTT 

c c c 



TTGGCTGAAGCCATGAGCCAAGTAACAAATCCAGCTAACATAATGATGCAGAGAGGCAATTTTAG^ 
ACTGTTAAGTGTTTCAATTGTGGCAAAGAAGGGCACATAGCCAAAAATTGGAGGGC^ 

AGATGTGGAAGGGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTTTTAGGGAAGATCTGGCCT 

TACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGAGCCAACAGCCCCACCAGAAGAGAGCTTCAGGTTTGGG 

GAGGAGAAAACAACTCCCTCTCAGAAGCAGGAGGCGATAGACAAGGAACTGTATCCTTTAACTTCCCTCAGATCACTC 



TTTGGCAACGACCCCTCGTCACAATAA 
HIV-1SF2 optimized Gag-DNA-sequence

gp160wtSF162

GTAGAAAAATTGTGGGTCACAGTCTATTATGGGGTACCTGTGTGGAAAGAAGCAACCACCACTCTATTTT 
GTGCATCAGATGCTAAAGCCTATGACACAGAGGTACATAATGTCTGGGCCACACATGCCTGTGTACCCAC 
AGACCCTAACCCACAAGAAATAGTATTGGAAAATGTGACAGAAAATTTTAACATGTGGAAAAATAACATG 
GTAGAACAGATGCATGAGGATATAATCAGTTTATGGGATCAAAGTCTAAAGCCATGTGTAAAGTTAACCC 
CACTCTGTGTTACTCTACATTGCACTAATTTGAAGAATGCTACTAATACCAAGAGTAGTAATTGGAAAGA 
GATGGACAGAGGAGAAATAAAAAATTGCTCTTTCAAGGTCACCACAAGCATAAGAAATAAGATGCAGAAA 
GAATATGCACTTTTTTATAAACTTGATGTAGTACCAATAGATAATGATAATACAAGCTATAAATTGATAA 
ATTGTAACACCTCAGTCATTACACAGGCCTGTCCAAAGGTATCCTTTGAACCAATTCCCATACATTATTG
TGCCCCGGCTGGTTTTGCGATTCTAAAGTGTAATGATAAGAAGTTCAATGGATCAGGACCATGTACAAAT 
GTCAGCACAGTACAATGTACACATGGAATTAGGCCAGTAGTGTCAACTCAATTGCTGTTAAATGGCAGTC 
TAGCAGAAGAAGGGGTAGTAATTAGATCTGAAAATTTCACAGACAATGCTAAAACTATAATAGTACAGCT
GAAGGAATCTGTAGAAATTAATTGTACAAGACCTAACAATAATACAAGAAAAAGTATAACTATAGGACCG
GGGAGAGCATTTTATGCAACAGGAGACATAATAGGAGATATAAGACAAGCACATTGTAACATTAGTGGAG 
AAAAATGGAATAACACTTTAAAACAGATAGTTACAAAATTACAAGCACAATTTGGGAATAAAACAATAGT 
CTTTAAGCAATCCTCAGGAGGGGACCCAGAAATTGTAATGCACAGTTTTAATTGTGGAGGGGAATTTTTC 
TACTGTAATTCAAGACAGCTTTTTAATAGTACTTGGAATAATACTATAGGGCCAAATAACACTAATGGAA 
CTATCACACTCCCATGCAGAATAAAACAAATTATAAACAGGTGGCAGGAAGTAGGAAAAGCAATGTATGC 
CCCTCCCATCAGAGGACAAATTAGATGCTCATCAAATATTACAGGACTGCTATTAACAAGAGATGGTGGT 
AAAGAGATCAGTAACACCACCGAGATCTTCAGACCTGGAGGTGGAGATATGAGGGACAATTGGAGAAGTG 
AATTATATAAATATAAAGTAGTAAAAATTGAGCCATTAGGAGTAGCACCCACCAAGGCAAAGAGAAGAGT
GGTGCAGAGAGAAAAAAGAGCAGTGACGCTAGGAGCTATGTTCCTTGGGTTCTTGGGAGCAGCAGGAAGC 
ACTATGGGCGCACGGTCACTGACGCTGACGGTACAGGCCAGACAATTATTGTCTGGTATAGTGCAACAGC 
AGAACAATTTGCTGAGAGCTATTGAGGCGCAACAGCATCTGTTGCAACTCACAGTCTGGGGCATCAAGCA 
GCTCCAGGCAAGAGTCCTGGCTGTGGAAAGATACCTAAAGGATCAACAGCTCCTAGGGATTTGGGGTTGC 
TCTGGAAAACTCATTTGCACCACTGCTGTGCCTTGGAATGCTAGTTGGAGTAATAAATCTCTGGATCAGA 
TTTGGAATAACATGACCTGGATGGAGTGGGAGAGAGAAATTGACAATTACACAAACTTAATATACACCTT 
AATTGAAGAATCGCAGAACCAACAAGAAAAGAATGAACAAGAATTATTAGAATTGGATAAGTGGGCAAGT
TTGTGGAATTGGTTTGACATATCAAAATGGCTGTGGTATATAAAAATATTCATAATGATAGTAGGAGGTT 
TAGTAGGTTTAAGGATAGTTTTTACTGTGCTTTCTATAGTGAATAGAGTTAGGCAGGGATACTCACCATT 
ATC^TTTCAGACCCGCTTCCCAGCCCCAAGGGGACCCGACAGGCCCGAAGGAATCGAAGAAGAAGGTGGA 
GAGAGAGACAGAGACAGATCCAGTCCATTAGTGCATGGATTATTAGCACTCATCTGGGACGATCTACGGA 
GCCTGTGCCTCTTCAGCTACCACCGCTTGAGAGACTTAATCTTGATTGCAGCGAGGATTGTGGAACTTCT
GGGACGCAGGGGGTGGGAAGCCCTCAAGTATTGGGGGAATCTCCTGCAGTATTGGATTCAGGAACTAAAG 
AATAGTGCTGTTAGTTTGTTTGATGCCATAGCTATAGCAGTAGCTGAGGGGACAGATAGGATTATAGAAG 
TAGCACAAAGAATTGGTAGAGCTTTTCTCCACATACCTAGAAGAATAAGACAGGGCTTTGAAAGGGCTTT 

GCTATAA 

FIG. 18 
gaattcgccaccatgaatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc

ttcgtttcacccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgnggaag 

gaggccaccaccaccccgtcccgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgcgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

aaccagaacctaaagccctgcgtgaagctgacccccctgcgcgtgaccccgcactgcaccaacctg 

aagaacgccaccaacaccaagagcagcaactggaaggagacggaccgcggcgagaccaagaacngc 

agcttcaaaocaagcaccggcaagccgaccaaccgcaacaccagcg-garcacccaggcccgcccc 

aaggtgaaccccgagcccacccccatccactactgcgcccccgccggc-rcgccaccccgaagcgc 

aacgacaaaaagttcaacggcagcggcccccgcaccaacgtgagcaccgcgcagcgcacccacggc 

acccgccccctagtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgacc 

cgcagcgaaaacttcaccgacaacgccaagaccatcaccgtgcagccgaaggagagcgnggagatc 

aactgcacccaccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgccctccac 

gccaccggcaacatcatcggcgacatccgccaggcccaccgcaacaccagcggcgagaagcggaac 

aacaccctgaagcaaaccgtgaccaagccgcaggcccagntcggcaacaagaccaccgtgttcaag 

cagagcaacoacggcgaccccgagatcgtgatgcacagctccaaccccggcggcgagcccttctac 

tgcaacagcacccaactgttcaacagcacctggaacaacaccatcggccccaacaacaccaacggc 

accatcaccctaccctaccacaccaagcagaccatcaaccgctggcacgaggtgggcaaggccatg 

tacgccccccccatccgcagccagatccgctgcagcagcaacaccaccggcctgccgctgacccgc 

aacagcggcaaagagatcagcaacaccaccgagatcttccgccccggcggcggcgacatgcgcgac 

aactagcgcaacaagctgtacaagcacaaggtggtgaagaccgagccccrgggcgcggcccccacc 

aaggccaaccaccacgtagcgcagcgcgagaagcgcgccgcgaccc-gggcgccatgctcctgggc 

-ccctgagccccaccgacaacaccacgggcgcccgcagcctgacccccaccgcgcaggcccgccag 

ctgccgaacgacatcgcqcagcagcagaacaacctgccgcgcgcca-cgaggcccagcagcacctg 

ccgcagctaaccatgtggggcatcaagcagccgcaggcccgcgcgctggccgtggagcgctacctg 

aaggaccaacagctgctgggcatctggggctgcagcggcaagctgazczgcaccaccgccgtgccc 

tggaacaccagctggagcaacaagagcctggaccagatctggaacaacacgacctggatggagtgg 

aagcgcgaaatcgacaactacaccaacctgacctacaccccgatcgaggagagccagaaccagcag 

gagaagaacgagcaggagctgctggagctggacaagtgggccagcczg-ggaactggctcgacatc 

agcaagtggccgtggtacatccaactcgag 

FIG. 24 

(SEQ ID NO:37)



WO 00/39302 PCT/US99/31245



gp140.modSF162.delV1V2

gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagaEcgcgctggagaacgcgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgggcgccggcaactgccagacc 

agcgtgatcacccaggcctgccccaaggtgagcttcgagcccacccccacccactactgcgccccc 

gccggcttcgccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtg 

agcaccgtgcagtgcacccacggcatccgccccgtggcgagcacccagccgctgctgaacggcagc 

ccggccgaggagggcgtggtgatccgcagcgagaacttcaccqacaacgccaagaccatcatcgtg 

cagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcancacc 

atcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgc 

aacatcagcggcgagaagcggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagctc 

ggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagaccgtgatgcacagcttc 

aactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacacc 

accggccccaacaacaccaacggcaccaccaccctgccctgccgcaccaagcagatcstcaaccgc 

tggcaggaggtgggcaaggccatgtacgccccccccatccgcggccagauccgctgcagcagcaac 

atcaccggcctgctgctgacccgcgacggcggcaaggagaccagcaacaccaccgagacct tccgc 

cccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggnggtgaagatc 

gagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcgcgccgtg 

accctgggcgccatgttcctgggcttccrgggcgccgccggcagcaccai.gggcgcccgcagccrg 

accctgaccgtgcaggcccgccagctgctgagcggcatcgcgcagcagcagaacaacccgctgcgc 

gccatcgaggcccagcagcacctgcrgcagctgaccgtgtggggcaLcaagcagccgcaggcccgc 

gtgctggccgtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaag 

ctgatctgcaccaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagat ccgg 

aacaacatgacctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctg 

atcgaggagagccagaaccagcaggagaagaacgagcaggagctgcrggagctggacaagtgggcc 

agcctgtggaactggttcgacatcagcaagtggctgtggtacatctaacr cgag 

FIG. 25 
gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttccgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgcgg 

gaccagagcctgaagccctgcgtgaagcrgacccccctgtgcgtgaccctgcacrgcaccaacctg 

aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 

agcttcaaggtgaccaccagcar ccgcaacaagacgcagaaggagtacgccctgttctacaagctg 

gacgtggrgcccatcgacaacgacaacaccagctacaagcrgatcaactgcaacaccagcgtgatc 

acccaggcctgccccaaggtgagcttcgagcccar ccccatccactactgcgcccccgccggcttc 

gccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtg 

cagtgcacccacggcatccgccccgtggtgagcacccagccgctgctgaacggcagcctggccgag 

gagggcgtggtgarccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaag 

gagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcaccatcggcccc 

ggccgcgccitctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagc 

ggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaag 

accatcgtgttcaagcagagcagcggcggcgaccccgagaccgtgatgcacagcttcaactgcggc 

ggcgagttcttctactgcaacagcacccagctgrrcaacagcacctggaacaacaccaccggcccc 

aacaacaccaacggcaccaccacccrcccctgccgcatcaagcagatcat caaccgctggcaggag 

gtgggcaaggccatgtacgccccccccatccgcgcccagacccgctgcagcagcaacatcaccggc 

ctgccgctgacccgcgacggcggcaaggagatcagcaacaccaccgagat cttccgccccggcggc 

agcgacatgcgcgacaactggcgcagcgagctgi:acaagt£caaggcggcgaagat:cgagcccctg 

agcgtggcccccaccaaggccaagcgccgcgrggzgcaccgcgagaagagcgccgtgaccctgggc 

accatgttccrgggcttcctgggcgccgccggcaccaccacgggcgcccgcagcctgaccctigacc 

grgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgag 

gcccagcagcacccgctgcagctgaccgcgcggggcatcaagcagccgcaggcccgcgtgctggcc 

gcggagcgctaccrgaaggaccagcagcrgccgggcatctggggctgcagcggcaagctgatctgc 

accaccgccgcgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaacatg 

acctggatggagtgggagcgcgagatcgacaacracaccaacctgacccacaccctgatcgaggag 

agccagaaccagcaggagaagaacgagcaggagc-gctggagctggacaagtgggccagccngtgg 

aactggttcgacatcagcaagtggctgtggtacatcxaactcgag 
gaattcgccaccatggacgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgcggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccaaagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 
aagaacaccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 
agctrcaaggtgggccccggcaagctgatcaaccgcaacaccagcgtgatcacccaggccugcccc 
aaggtgagcttcgagcccatccccatccactactgcgcccccgccggcrtcgccaccccgaagcgc 
aacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgcgcagngcacccacggc 
atccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggtgatc 
cgcagcgagaacttcaccgacaacgccaagaccatcatcgcgcagctgaaggagagcgtggagatc 
aactgcacccgccccaacaacaacacccgcaagagcatcaccatcggccccggccgcgccctctac 
gccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgagaagtggaac 
aacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaagaccatcgtgttcaag 
cagagcagcggcggcgaccccgagatcgtgacgcacagcttcaactgcggcggcgagttcttctac 
ngcaacagcacccagctgttcaacagcacctggaacaacaccatcggccccaacaacaccaacggc 
accatcaccctgccctgccgcatcaagcagatcatcaaccgccggcaggaggcgggcaaggccatg 
tacaccccccccatccgcggccagacccgctgcagcagcaacatcaccggcctgctgctgacccgc 
gacgacgacaaggagatcagcaacaccaccgagarcctccgccccggcggcggcgacatgcgcgac 
aactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctgggcgtggcccccacc 
aaaqccaagcgccgcgtggtgcagcgcgagaagagcgccgcgacccrgggcgccargrtcccgggc 
ttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgaccgtgcaggcccgccag 
ctgccgagcggcatcgtgcagcagcagaacaacctgctgcgcgccancgaggcccagcagcacctg 
ctgcagctgaccgtgtggggcatcaagcagcLgcaggcccgcgtgctggccgtggagcgccacctg 
aaggaccagcagctgctgggcatctggggctgcagcggcaagctgaccrgcaccaccgccgcgccc 
tggaacgccagctggagcaacaagagcctggaccagatctggaacaacatgacccggatggagtgg 
gagcacgagatcgacaaccacaccaacctgatCLacaccctgatcgaggagagccagaaccagcag 
gagaaaaacgagcaggagctgctggagctggacaagtgggc cage erg tggaac egg ttcgacatc 
agcaagtggccgtggtacatctaactcgag 
gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgcgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgrgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgrgaagcrgacccccctgtgcgcgggcgccggcaaccgccagacc 
agcgrgaccacccaggcctgccccaaggtgagccrcgagcccacccccatccactacugcgccccc 
gccggcttcgccaccctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtg 
agcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagc 
ctggccgaggagggcgcggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtg 
cagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcacc 
atcggccccggccgcgcctrcuacgccaccggcgacatcatcggcgacarccgccaggcccactgc 
aacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttc 
.ggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagatcgrgacgcacagcttc 
aactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacacc 
atcggccccaacaacaccaacggcaccatcaccctgcccigccgcaccaagcagatcaccaaccgc 
cggcaggaggtgggcaaggccargcacgccccccccatccgcggccagauccgcrgcagcagcaac 
atcaccggcctgctgccgacccgcgacggcggcaaggagatcagcaacaccaccgagaccctccgc 
cccggcggcggcgacatgcgcgacaaccggcgcagcgagctguacaacracaaggtggtgaagatc 
aaacccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcaccgcgagaagagcgccgtg 
accctgggcgccacgttcctgggctrccrgggcgccgccggcagcarca-gcgcgcccgcagcctg 
accctgaccgrgcaggcccgccagctgcrgagcggcatcgLgcagcaccagaacaacctgctgcgc 
gccatcgaggcccagcagcaccrgcrgcagcrgaccgtgrggggca- caagcagctgcaggcccgc 
atgctggccgrggagcgctacctgaaggaccagcagctgctgggca- r-ggggctgcagcggcaag 
ctgatctgcaccaccgccgtgccctggaacgccagctggagcaacaagagccuggaccagatctgg 
aacaacatgacctggatggagtgggagcgcgagaccgacaactacaccaaccrgatccacaccctg 
at cgaggagagccagaaccagcaggagaagaacgagcaggagct get gcagccggacaagtgggcc 
agcctgcggaactggttcgacatcagcaagtggctgtggcacatcraacccgag 

FIG. 28 
gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcctgaagccctgcgtgaagcngacccccctgtgcgtgaccctgcactgcaccaa cc tg 

aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 

agcttcaaggcgaccaccagcacccgcaacaagatgcagaaggagtacgccctgttctacaagctg 

gacgtggtgcccatcgacaacgacaacaccagctacaagctgatcaactgcaacaccagcgtgatc 

acccaggcctgccccaaggcgagcttcgagcccatccccatccactactgcgcccccgccggcttc 

gccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtg 

cagtgcacccacggcarccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgag 

gagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaag 

gagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcat caeca tcggcccc 

ggccgcgcctcctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagc 

ggcgagaagtggaacaacaccccgaagcagatcgtgaccaagccgcaggcccagttcggcaacaag 

accatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggc 

ggcgagt tcttct act gcaacagcacccagctgttcaacagcacctgcaacaacacca tcggcccc 

aacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcazcaaccgctggcaggag 

gtgggcaaggccatgi:acgccccccccatccgcggccagatccgctgcagcagcaacatcaccggc 

ctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatct tccgccccggcggc 

ggcgacatgcgcgacaacrggcgcagcgagctgtacaagtacaaggtggcgaagatcgagcccctg 

ggcgtggcccccaccaaggccatcagcagcgtggtgcagagcgagaagagcgccgtgaccctgggc 

gccatgttcctgggctccctgggcgccgccggcagcaccatgggcgcccgcagcctgaccccgacc 

gtgcaggcccgccagccgctgagcggcatcgtgcagcagcagaacaaccngctgcgcgccatcgag 

gcccagcagcaccteccgcagctgaccgtgtggggcatcaagcagctccaggcccgcgtgcnggcc 

gtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgc 

accaccgccgtgccccggaacgccagctggagcaacaagagcctggaccagatcrggaacaacatg 

acctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctgat cgaggag 

agccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtgg 

aactggttcgacatcagcaagtggctgcggtacatctaactcgag 
gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc
ttcgt-ttcgcccagcgccgtggagaagctgtgggngaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagacgcacgaggacatcatcagcctgtgg 
aaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 
aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 
agctccaaggtgggcgccggcaagctgatcaacngcaacaccagcgtgatcacccaggcctgcccc 
aaggtgagcttcgagcccatccccatccaccactgcgcccccgccggctrcgccatcctgaagtgc 
aacgacaagaagttcaacggcagcggcccctgcaccaacgcgagcaccgngcagtgcacccacggc 
atccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgaggagggcgtggrgatc 
cgcagcgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaaggagagcgtggagatc 
aac tgcacccgccccaacaacaacacccgcaagagcat caeca tcggccccggccgcgccttctac 
gccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgagaagtggaac 
aacaccctgaagcagatcgtgaccaagctgcagccccagtccggcaacaagaccatcgtgttcaag 
cagagcagcggcggcgaccccgagatcgtgatgcacagccrcaactgcggcggcgagttcttctac 
tgcaacagcacccagctcttcaacagcacctgaaacaacaccaccggccccaacaacaccaacggc 
accancaccctgccctgccgcatcaagcagatca-caaccgctggcaggaggtgggcaaggccatg 
tacgccccccccacccgcggccagatccgctgcaccagcaacatcaccggcctgctgctgacccgc 
gacggcggcaaggagarcagcaacaccaccgaga- ctrccgccccggcggcggcgacatgcgcgac 
aactggcgcagcgagctgcacaagtacaaggtgcrcaagancgagccccugggcgcggcccccacc 
aaggccatcagcagcgtggtgcagagcgagaagagcgccg-gaccctgggcgccatgttcctgggc 
t tcccgggcgccgccggcagcaccacgggcgcccccagcccgaccctgaccgtgcaggcccgccag 
ctgctgagcggcatcgtgcagcagcagaacaacccgctgcgcgccatcgaggcccagcagcacctg 
ctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggccgtggagcgctacctg 
aaggaccagcagctgctgggcatctggggctgcaccggcaagctgatctgcaccaccgccgtgccc 
tggaacgccagcrggagcaacaagagcctggaccagatcrggaacaacangacctggatggagtgg 
gagcgcgagatcgacaactacaccaacccgatcracaccccgatcgaggagagccagaaccagcag 
gagaagaacgagcaggagctgctggagctggacaactgggccagcctgtggaaccggttcgacatc 
agcaagtggctgtgguacatctaacccgag 
gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc

ttcgtttcgcccagcgccgrggagaagccgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgcgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggzggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgggcgccggcaactaccagacc 

a 9cgtgatcacccaggcccgccccaaggcgagctrcgagcccanccccat:ccactact:gcgccccc 

gccggcttcgccatcctgaagtgcaacgacaagaagctcaacggcagcggcccctgcaccaacgtg 

agcaccgtgcagtgcacccacggcacccgccccgtggtgagcacccagccgctgccgaacggcagc 

ctggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtg 

cagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcacc 

atcggccccggccgcgccttcracgccaccggcgacaccaccggcgacauccgccaggcccactgc 

aacatcagcggcgagaagtggaacaacacccrgaagcagatcgtgaccaagctgcaggcccagttc 

ggcaacaagaccatcgtgttcaagcagagcagcggcggcgaccccgagaccgcgatgcacagcttc 

aactgcggcggcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacacc 

atcggccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcarcaaccgc 

tggcaggaggtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaac 

atcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgc 

cccggcggcggcgacatgcgcgacaaccggcgcagcgagctgtacaagtacaaggtggtgaagatc 

gagcccctgggcgtggcccccaccaaggccaucagcagcgcggtgcacagcgagaagagcgccgtg 

accctgggcgccacgt tccrgggctt cctgggcgccgccggcagcaccacgggcgcccgcagcctg 

accccgaccgtgcaggcccgccagcrgctgagcggcatcgtgcagcagcagaacaaccrgctgcgc 

gccatcgaggcccagcagcacctgctgcagccgaccgtgtggggcatcaagcagctgcaggcccgc 

gtgctggccgtggagcgctacctgaaggaccagcagctgctgggcatctggggcrgcagcggcaag 

ctgatctgcaccaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctgg 

aacaacatgacctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctg 

accgaggagagccagaaccagcaggagaagaacgagcaggagctgctcgagctggacaagcgggcc 

agcctgtggaaccggttcgacatcagcaagtggctgcggtacatctaactcgag 
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gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagaccgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 
aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 
agcttcaaggtgaccaccagcatccgcaacaagatgcagaaggagtacgccctgttctacaagctg 
gacgtggtgcccatcgacaacgacaacaccagctacaagctgatcaactgcaacaccagcgtgatc 
acccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttc 
gccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtg 
cagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagcctggccgag 
gagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccaccatcgtgcagctgaag 
gagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcatcaccatcggcccc 
ggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagc 
ggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaag 
accatcgtgttcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactgcggc 
ggcgagttcctctaccgcaacagcacccagctgttcaacagcacctggaacaacaccatcggcccc 
aacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggag 
gtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaacatcaccggc 
ctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggc 
ggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatcgagcccctc 
ggcgtggcccccaccatcgccatcagcagcgtggtgcagagcgagaagagcgccgtgaccctgggc 
gccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcctgaccctgacc 
gtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccatcgag 
gcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgcgtgctggcc 
gtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgc 
accaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctggaacaacatg 
acctggatggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggag 
agccagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtgg 
aactggttcgacatcagcaagtggctgtggtacatctaactcgag 
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gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtacracggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagaLcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagacgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctgcgtgaagctgacccccctgtgcgcgaccctgcactgcaccaacctg 
aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 
agcttcaaggrgggcgccggcaagccgatcaactgcaacaccagcgrgatcacccaggccrgcccc 
aaggtgagccrcgagcccacccccatccactacrgcgccccccccggcttcgccatcctgaagtgc 
aacgacaagaagttcaacggcagcggcccccgcaccaacgtgagcaccgtgcagtgcacccacggc 
atccgccccgrggtgagcacccagctgctgctgaacggcaccccggccgaggagggcgtggtgatc 
cgcagcgagaacttcaccgacaacgccaagaccatcatcg-gcagctgaaggagagcgtggagatc 
aactgcacccgccccaacaacaacacccgcaagagcatcaccaccggccccggccgcgccttctac 
gccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcggcgagaagtggaac 
aacaccctgaagcagatcgtgaccaagctgcaggcccagtrcggcaacaagaccatcgtgcccaag 
cagagcagcggcggcgaccccgagatcgcgatgcacagctrcaaccgcggcggcgagttcttctac 
tgcaacagcacccagctgttcaacagcacctggaacaacaccaLcggccccaacaacaccaacggc 
accatcaccctgccctgccgcatcaagcagatcaccaaccgrcgccaggaggcgggcaaggccatg 
tacgccccccccatccgcggccagatccgctgcagcagcaacaccaccggcctgctgctgacccgc 
gacggcggcaaggagaccagcaacaccaccgagaccttcccrcccggcggcggcgacatgcgcgac 
aactggcgcagcgagctgtacaagtacaaggtggtgaagarcgagcccctgggcgcggcccccacc 
atcgccaccagcagcgtggcccagagcgagaagagcgccgrgaccczgggcgccatgttcccgggc 
ttcctgggcgccgccggcagcaccacgggcgcccgcagccrgacccrgaccgrgcaggcccgccag 
ctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccarcgaggcccagcagcacctg 
ctgcagctgaccgtgtggggcatcaagcagctgcaggcccccgLgctggccgtggagcgctacctg 
aaggaccagcagcrgctgggcacctggggctgcagcggcaagcrga" ctgcaccaccgccgcgccc 
tggaacgccagctggagcaacaagagcccggaccagatccggaacaacatgacctggatggagtgg 
gagcgcgagatcgacaacuacaccaacctgatccacaccccgarccaggagagccagaaccagcag 
gagaagaacgagcaggagccgcrggagccggacaagtgccccagcctcLggaactggttcgacatc 
agcaagtggctgtggcacatccaacccgag 
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gaattcgccaccatggatgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcccgcgtgcccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagccrgtgg 
aaccagaacccgaagccctgcgtgaagctgacccccccgtgcgcgggcgccggcaactgccagacc 
agcgtgatcacccaggcccgccccaaggtgagctccgagcccatccccatccactactgcgccccc 
gccggctzcgccatcctgaagtgcaacgacaagaagctcaacggcagcggcccctgcaccaacgtg 
agcaccgcgcagtgcacccacggcacccgccccgnggcgagcacccagctgctgctgaacggcagc 
ctggccgaggagggcgtggtgatccgcagcgagaacttcaccgacaacgccaagaccatcatcgtg 
cagctgaaggagagcgtggagatcaacrgcacccgccccaacaacaacacccgcaagagcaccacc 
atcggccccggccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccaccgc 
aacatcagcggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttc 
ggcaacaagaccaccgcgttcaagcagagcagcggcggcgaccccgagaccgtgatgcacagcttc 
aactgcggcggcgagtccttctactgcaacagcacccagccgttcaacagcacctggaacaacacc 
accggccccaacaacaccaacggcaccatcaccct:gcccr.gccgcatcaagcagatcatcaaccgc 
tggcaggaggtgggcaaggccatgtacgccccccccatccgcggccagatccgctgcagcagcaac 
atcaccggcctgctgctgacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgc 
cccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatc 
gagcccctgggcgtggcccccaccatcgccatcagcagcgtggtgcagagcgagaagagcgccgtg 
accctgggccccargttcctgggctr cccgggcgccgccggcagcaccaLgggcgcccgcagcctg 
accccgaccgcgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaaccEgccgcgc 
gccatcgaggcccagcagcacctgctgcagctgaccg^gnggggcatcaagcagctgcaggcccgc 
gtgctggccgtggagcgctaccrgaaggaccagcagctgctgggcatctggggctgcagcggcaag 
ctgatctgcaccaccgccgtgccctggaacgccagctggagcaacaagagcctggaccagatctgg 
aacaacatgacccggatggagtgggagcgcgagancgacaactacaccaacctgatctacaccctg 
atcgaggagagccagaaccagcaggagaagaacgagcaggagctgctggagcrggacaagtgggcc 
agcctgtggaaccggttcgacatcagcaagtggcrgtggtacatctaactcgag 
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gaattcgccaccatggatacaataaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 
ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 
gaggccaccaccaccctgttctgcgccagcgacgccaaggcctacgacaccgaggtgcacaacgtg 
tgggccacccacgcctgcgtacccaccgaccccaacccccaggagatcgtgctggagaacgtgacc 
gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 
gaccagagcctgaagccctacgtgaagctgacccccctgtgcgtgaccctgcactgcaccaacctg 
aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcggcgagatcaagaactgc 
agcttcaaggtgaccaccaacatccgcaacaagatgcagaaggagtacgccctgttctacaagctg 
gacgtggcgcccatcgacaacgacaacaccagctacaagctgatcaactgcaacaccagcgtgatc 
acccaggcctgccccaaggtgagcttcgagcccatccccatccactactgcgcccccgccggcttc 
gccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcaccgtg 
cagtgcacccacggcaccccccccgcggtgagcacccagctgctgctgaacggcagcctggccgag 
aagggcacagtaatccgcagcgagaacttcaccgacaacgccaagaccaLcatcgcgcagccgaag 
aagagcgtggagatcaactacacccgccccaacaacaacacccgcaagagcatcaccatcggcccc 
agccgcgccttctacgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagc 
ggcgagaagtggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttcggcaacaag 
accatcgtattcaagcagagcagcggcggcgaccccgagatcgtgatgcacagcttcaaccgcggc 
agcgagttcttctactgcaacagcacccagctgttcaacagcacctggaacaacaccatcggcccc 
aacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgctggcaggag 
gtgggcaaggccacgcacgccccccccacccgcggccagatccgctgcagcagcaacaccaccggc 
ctgctgctaacccgcgacggcggcaaggagatcagcaacaccaccgagatcttccgccccggcggc 
ggcgacacgcgcgacaactggcgcagcgagctgtacaagtacaaggtggrgaagaccgagcccctg 
ggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcgcgccgtgaccctgggc 
gccatgttcctgggcttcctgggcgccgccggcagcaccatgggcgcccgcagcccgaccccgacc 
gtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgcgccaccgag 
gcccagcaacacctgctgcagccgaccgtgtggggcatcaagcagctgcaggcccgcgtgccggcc 
ctggagcgccacccgaaggaccagcagctgctgggcatctggggctgcagcggcaagctgacctgc 
accaccgccgtgccctggaacgccagctggagcaacaagagcccggaccagatctggaacaacatg 
acctggacggagtgggagcgcgagatcgacaactacaccaacctgatctacaccctgatcgaggag 
aaccaaaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggccagcctgtgg 
aactggttcgacatcagcaagtggctgtggtacatcaagaccttcatcacgatcgcgggcggcctg 
gtgggcctgcgcatcgtgttcaccgcgctgagcatcgtgaaccgcgtgcgccagggctacagcccc 
ctgagcttccagacccgcttccccgccccccgcggccccgaccgccccgagggcatcgaggaggag 
ggcggcgagcgcgaccgcgaccgcagcagccccctggtgcacggccrgccggccctgatctgggac 
aacctgcgcagcctgtgcctgtncagctaccaccgcctgcgcgacctgaucctgatcgccgcccgc 
atcgtggagctgctgggccgccgcggctgggaggccctgaagtactggggcaacctgccgcagtac 
cggatccaggagctgaagaacagcgccgcgagcctgttcgacgccatcgccaccgccgtggccgag 
ggcaccgaccgcatcatcgaggcggcccagcgcatcggccgcgccctcctgcacatcccccgccgc 

acccgccagggcttcgagcgcgccctgctgtaactcgag 
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gaattcgccaccatggacgcaatgaagagagggctctgctatgcgctgcrgccgtgtggagcagtc 

ttcgtttcgcccagcgccgtggagaagctgtgggtgaccgtgtactacgacgcgcccgtgtggaag 

gaggccaccaccaccccgttctgcgccagcgacgccaaggcctacaacaccaaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcacgccggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaagacatcatcagcctgtgg 

gaecagagcecgaagccctgcgtgaagctgacccccctgtgcgtgaccccacactgcaccaacctg 

aagaacgccaccaacaccaagagcagcaactggaaggagatggaccgcgacgagatcaagaactgc 

agcttcaaggcgggcgccggcaagctgatcaactgcaacaccagcgcaaccacccaggcctgcccc 

aaggcgagcctcgagcccatccccatccactaccgcgcccccgccagcrtcaccatcctgaagtgc 

aacgacaagaagttcaacggcagcggcccctgcaccaacgtgagcacccrQcagtgcacccacggc 

atccgccccgrggtgagcacccagctgctgctgaacggcaacccaaccaaaaagagcgtggcgatc 

cgcagcgagaaccccaccgacaacgccaagaccaccaccatgcagccaaaaaagagcgtagagatc 

aactgcacccgccccaacaacaacacccgcaagagcatcaccaccqaccccagccgcgccctctac 

gccaccggcgacaccatcggcgacatccgccaggcccaccgcaacatcagcagcgagaagtggaac 

aacaccctgaagcagaccgcgaccaagctgcaggcccagttcggcaacaaaaccatcgtgttcaag 

cagagcagcggcggcgaccccgagatcgtgatgcacagcttcaactaccacggcgaqttcttctac 

tgcaacagcacccagctgttcaacagcacccgQaacaacaccatcagcc=caacaa«ccaacggc 

accatcaccccgccctgccgcatcaagcagatcaccaaccactoacaaaagatgggcaaggccatq 

tacgccccccccatccgcggccagatccgctgcagcagcaacatcaccaacctgccgctgacccgc 

gacggcggcaaggagaccagcaacaccaccgagatcttccQccccagcaacagcgacatgcgcaac 

aaccggcgcagcgagctgtacaagtacaaggtggtgaagatcgaacccctgagcgrggcccccacc 

aaggccaagcgccgcgcggtgcagcgcgagaagcgcgccgtgaccctasaccccatgttcctgggc 

ttcctgggcgccgccggcagcaccatgggcgcccgcagcctaaccctaa^cocgcaggcccgccag 

ctgctgagcggcatcgtgcagcagcagaacaacccgctacccaccaccaaaacccaacagcacctg 

ctgcagccgaccgtgtgoggcaccaagcagctgcaggcccgcacQctcsccctggaqcgctacctg 

aaggaccagcagccgccgggcauccggggctgcagcqacaaactaatcrzcaccaccgccgcgcc- 

cggaacgccagccggagcaacaagagcctggaccagacctqqaacaacaugacctqqatggagtgg 

gagcgcgagatcgacaaccacaccaacctgatctacaccccgatcoacaaaaqccagaaccagcag 

gagaagaacgagcaggagctgctggagctggacaagtagaccaacctcraaaaccagctcgacatc 

agcaagtggccgtggtacatcaagatcctcaccatgatcgcaqqcaccciggcggicctgcgcatc 

gtgttcaccgcgctgagcatcgcgaaccgcgngcgccagggctacaaccccctgaqcttccagacc 

cgcttccccgccccccgcggccccgaccgccccgagggcaccgaaaacaagqqcggcqagcgcgac 

cgcgaccgcagcagccccctggcgcacggcctgctggccctqat«ccaacoacctg^gcagcctg 

tgcctgtccagccaccaccgcctgcgcgacccgatcctqaccqccccccacaccqcqgagctactq 

ggccgccgc g gccgggag g ccccgaagcactggggcaaccca«gca==ac=agatccaggaictg 

aagaacagcgccgtgagcctgttcgacgccatcgccatcaccqtoaccaagoacaccgaccgcat? 

atcgaggcggcccagcgcaccggccgcgccttcccgcacacc«CMc=acacccciccaggqctCc 
gagcgcgccc-gctgtaactcgag " " " 33 
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gaattcgccaccatggacgcaatgaagagagggctctgctgtgtgctgctgctgtgtggagcagtc 

ttcgtttcgcccagcgcccriggagaagctgtgggtgaccgtgtactacggcgtgcccgtgtggaag 

gaggccaccaccaccctgctctgcgccagcgacgccaaggcctacqacaccgaggtgcacaacgtg 

tgggccacccacgcctgcgtgcccaccgaccccaacccccaggagatcgcgctggagaacgtgacc 

gagaacttcaacatgtggaagaacaacatggtggagcagatgcacgaggacatcatcagcctgtgg 

gaccagagcccgaagccctgcgtgaagctgacccccctgtgcgtgggcgccggcaactgccagacc 

agcgtgatcacccaggcctgccccaaggtgagcttcgagcccatccccat ccactactgcgccccc 

gccggcttcgccatcctgaagtgcaacgacaagaagttcaacggcagcggcccctgcaccaacgtg 

agcaccgtgcagtgcacccacggcatccgccccgtggtgagcacccagctgctgctgaacggcagc 

ctggccgaggagggcgtggcgatccgcagcgagaacctcaccgacaacgccaagaccatcatcgng 

cagctgaaggagagcgtggagatcaactgcacccgccccaacaacaacacccgcaagagcaccacc 

atcggccccggccgcgccc "ctacgccaccggcgacaccaccggcgacat ccgccaggcccactgc 

aacatcagcggcgagaagcggaacaacaccctgaagcagatcgtgaccaagctgcaggcccagttc 

ggcaacaagaccatcgcgr- caagcagagcagcggcggcgaccccgagat cgcgatgcacagcttc 

aactgcggcggcgagrcc" ctactgcaacagcacccagctgttcaacagcacctggaacaacacc 

atcggccccaacaacaccaacggcaccatcaccctgccctgccgcatcaagcagatcatcaaccgc 

tggcaggaggcgggcaaggccatgtacgccccccccatccgcggccaqat ccgctgcagcagcaac 

accaccggcccgcrgccgacccgcgacggcggcaaggagatcagcaacaccaccgagatct tccgc 

cccggcggcggcgacatgcgcgacaactggcgcagcgagctgtacaagtacaaggtggtgaagatc 

gagcccctgggcgtggcccccaccaaggccaagcgccgcgtggtgcagcgcgagaagcgcgccgtg 

accctgggcgccatgttccrgggcctccrgggcgccgccggcagcaccatgggcgcccgcagcctg 

accctgaccgtgcaggcccgccagctgctgagcggcatcgtgcagcagcagaacaacctgctgcgc 

gccatcgaggcccagcagcacctgctgcagctgaccgtgtggggcatcaagcagctgcaggcccgc 

gtgctggccgtggagcgctacctgaaggaccagcagctgctgggcatctggggctgcagcggcaag 

ccgatccgcaccaccgccgrgcccrggaacgccagccggagcaacaagagcctggaccagatctgg 

aacaacacgacctggarggagtgggagcgcgagaccgacaactacaccaacctgatctacaccctg 

atcgaggagagccagaaccagcaggagaagaacgagcaggagcLgctggagctggacaagtgggcc 

agcctgtggaactggttccacatcagcaagtggctgtggtacatcaagatcttcatcatgatcgtg 

ggcggcctggtgggcctgcgcatcgtgttcaccgcgctgagcatcgtgaaccgcgtgcgccagggc 

cacagccccctgagcrtccagacccgctcccccgccccccgcggccccgaccgccccgagggcatc 

gaggaggagggcggcgagcgcgaccgcgaccgcagcagccccctggcgcacggcctgctggccctg 

arctgggacgacctgcgcagcctgtgcctgttcagctaccaccgcctgcgcgacccgatcctgatc 

gccgcccgcat cgtggagccgctgggccgccgcggczgggaggccccgaagt:actggggcaacctg 

crgcagtaccggatccagaagctgaagaacagcgccgtgagcctgttcgacgccaccgccatcgcc 

gcggccgagggcaccgaccgcatcancgaggtggcccagcgcatcggccgcgccttcctgcacatc 

ccccgccgcat ccgccagogctr cgagcgcgccccgccgcaacccqag 
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ACAACAGTCTTGTGGGTCACAGTCTATTATGGGGTACCTGTGTGGAAAGAAG 

CAACCACCACTCTGTTTTGTGCATCAGATGCTAAAGCATACAAAGCAGAGGC 

ACATAACGTCTGGGCTACACATGCCTGTGTACCCACAGACCCCAACCCACAG 

GAAGTAAATTTAACAAATGTGACAGAAAATTTTAACATGTGGAAAAATAACA 

TGGTGGAACAGATGCATGAGGATATAATCAGTTTATGGGATCAAAGCCTAAA 

GCCATGTGTAAAATTAACCCCACTCTGTGTTACTTTAAATTGTACTGATAAGT 

TGACAGGTAGTACTAATGGCACAAATAGTACTAGTGGCACTAATAGTACTAG 

TGGCACTAATAGTACTAGTACTAATAGTACTGATAGTTGGGAAAAGATGCCA 

GAAGGAGAAATAAAAAACTGCTCTTTCAATATCACCACAAGTGTAAGAGATA 

AAGTGCAGAAAGAATATTCTCTCTTCTATAAACTTGATGTAGTACCAATAGAT 

AATGATAATGCTAGCTATAGATTGATAAATTGTAATACCTCAGTCATTACACA 

AGCCTGTCCAAAGGTATCTTTTGAACCAATTCCCATACATTATTGTGCCCCGG 

CTGGTTTTGCGATTCTAAAGTGTAAAGATAAGAAGTTCAATGGAACAGGACC 

ATGTAAAAATGTCAGCACAGTACAATGCACACATGGAATTAGACCAGTAGTA 

TCAACTCAACTGCTGTTAAATGGCAGTCTAGCAGAAGAAGAGATAGTACTTA 

GATCTGAAAATTTCACAGACAATGCTAAAACCATAATAGTACAGCTGAATGA 

ATCTGTAGAAATTAATTGTATAAGACCCAACAATAATACAAGAAAAAGTATA 

CATATAGGACCAGGGAGAGCATTTTATGCAACAGGTGATATAATAGGAGACA 

TAAGACAAGCACATTGTAACATTAGTAAAGCAAACTGGACTAACACTTTAGA 

ACAGATAGTTGAAAAATTAAGAGAACAATTTGGGAATAATAAAACAATAATC 

TTTAATTCATCCTCAGGAGGGGACCCAGAAATTGTATTTCACAGTTTTAATTG 

TGGAGGGGAATTTTTCTATTGTAATACATCACAACTATTTAATAGTACCTGGA 

ATATTACTGAAGAGGTAAATAAGACTAAAGAAAATGACACTATCATACTCCC 

ATGCAGAATAAGACAAATTATAAACATGTGGCAAGAAGTAGGAAAAGCAAT 

GTATGCCCCTCCCATCAGAGGACAAATTAAATGTTCATCAAATATTACAGGG 

CTGCTATTAACTAGAGATGGTGGTACTAACAATAATAGGACGAACGACACCG 

AGACCTTCAGACCTGGGGGAGGAAACATGAAGGACAATTGGAGAAGTGAAT 

TATATAAATATAAAGTAGTAAGAATTGAACCATTAGGAGTAGCACCCACCCA 

GGCAAAGAGAAGAGTGGTGCAAAGAGAGAAAAGA 
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ACAACAGTCTTGTGGGTCACAGTCTATTATGGGGTACCTGTGTGGAAAGAAG 

CAACCACCACTCTGTTTTGTGCATCAGATGCTAAAGCATACAAAGCAGAGGC 

ACATAACGTCTGGGCTACACATGCCTGTGTACCCACAGACCCCAACCCACAG 

GAAGTAAATTTAACAAATGTGACAGAAAATTTTAACATGTGGAAAAATAACA 

TGGTGGAACAGATGCATGAGGATATAATCAGTTTATGGGATCAAAGCCTAAA 

GCCATGTGTAAAATTAACCCCACTCTGTGTTACTTTAAATTGTACTGATAAGT 

TGACAGGTAGTACTAATGGCACAAATAGTACTAGTGGCACTAATAGTACTAG 

TGGCACTAATAGTACTAGTACTAATAGTACTGATAGTTGGGAAAAGATGCCA 

GAAGGAGAAATAAAAAACTGCTCTTTCAATATCACCACAAGTGTAAGAGATA 

AAGTGCAGAAAGAATATTCTCTCTTCTATAAACTTGATGTAGTACCAATAGAT 

AATGATAATGCTAGCTATAGATTGATAAATTGTAATACCTCAGTCATTACACA 

AGCCTGTCCAAAGGTATCTTTTGAACCAATTCCCATACATTATTGTGCCCCGG 

CTGGTTTTGCGATTCTAAAGTGTAAAGATAAGAAGTTCAATGGAACAGGACC 

ATGTAAAAATGTCAGCACAGTACAATGCACACATGGAATTAGACCAGTAGTA 

TCAACTCAACTGCTGTTAAATGGCAGTCTAGCAGAAGAAGAGATAGTACTTA 

GATCTGAAAATTTCACAGACAATGCTAAAACCATAATAGTACAGCTGAATGA 

ATCTGTAGAAATTAATTGTATAAGACCCAACAATAATACAAGAAAAAGTATA 

CATATAGGACCAGGGAGAGCATTTTATGCAACAGGTGATATAATAGGAGACA 

TAAGACAAGCACATTGTAACATTAGTAAAGCAAACTGGACTAACACTTTAGA 

ACAGATAGTTGAAAAATTAAGAGAACAATTTGGGAATAATAAAACAATAATC 

TTTAATTCATCCTCAGGAGGGGACCCAGAAATTGTATTTCACAGTTTTAATTG 

TGGAGGGGAATTTTTCTATTGTAATACATCACAACTATTTAATAGTACCTGGA 

ATATTACTGAAGAGGTAAATAAGACTAAAGAAAATGACACTATCATACTCCC 

ATGCAGAATAAGACAAATTATAAACATGTGGCAAGAAGTAGGAAAAGCAAT 

GTATGCCCCTCCCATCAGAGGACAAATTAAATGTTCATCAAATATTACAGGG 

CTGCTATTAACTAGAGATGGTGGTACTAACAATAATAGGACGAACGACACCG 

AGACCTTCAGACCTGGGGGAGGAAACATGAAGGACAATTGGAGAAGTGAAT 

TATATAAATATAAAGTAGTAAGAATTGAACCATTAGGAGTAGCACCCACCCA 

GGCAAAGAGAAGAGTGGTGCAAAGAGAGAAAAGAGCAGTGGGACTAGGAG 

CTTTGTTCATTGGGTTCTTGGGAGCAGCAGGAAGCACTATGGGCGCAGCGTC 

AGTGACGCTGACGGTACAGGCCAGACAATTATTGTCTGGTATAGTGCAACAG 

CAGAACAATTTGCTGAGAGCTATTGAGGCGCAACAGCATCTGTTGCAACTCA 

CGGTCTGGGGCATCAAACAGCTCCAGGCAAGAATCCTGGCTGTGGAAAGATA 

CCTAAAGGATCAACAGCTCCTAGGGATTTGGGGTTGCTCTGGAAAACTCATTT 

GCACCACTACTGTGCCTTGGAACTCTAGTTGGAGTAATAAATCTCTGACTGAG 

ATTTGGGATAATATGACCTGGATGGAGTGGGAAAGAGAAATTGGCAATTATA 

CAGGCTTAATATACAATTTAATTGAAATAGCACAAAACCAGCAAGAAAAGAA 

TGAACAAGAATTATTGGAATTAGACAAGTGGGCAAGTTTGTGGAATTGGTTT 

GATATAACAAACTGGCTGTGGTATATA 
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ACAACAGTCTTGTGGGTCACAGTCTATTATGGGGTACCTGTGTGGAAAGAAG 

CAACCACCACTCTGTTTTGTGCATCAGATGCTAAAGCATACAAAGCAGAGGC 

ACATAACGTCTGGGCTACACATGCCTGTGTACCCACAGACCCCAACCCACAG 

GAAGTAAATTTAACAAATGTGACAGAAAATTTTAACATGTGGAAAAATAACA 

TGGTGGAACAGATGCATGAGGATATAATCAGTTTATGGGATCAAAGCCTAAA 

GCCATGTGTAAAATTAACCCCACTCTGTGTTACTTTAAATTGTACTGATAAGT 

TGACAGGTAGTACTAATGGCACAAATAGTACTAGTGGCACTAATAGTACTAG 

TGGCACTAATAGTACTAGTACTAATAGTACTGATAGTTGGGAAAAGATGCCA 

GAAGGAGAAATAAAAAACTGCTCTTTCAATATCACCACAAGTGTAAGAGATA 

AAGTGCAGAAAGAATATTCTCTCTTCTATAAACTTGATGTAGTACCAATAGAT 

AATGATAATGCTAGCTATAGATTGATAAATTGTAATACCTCAGTCATTACACA 

AGCCTGTCCAAAGGTATCTTTTGAACCAATTCCCATACATTATTGTGCCCCGG 

CTGGTTTTGCGATTCTAAAGTGTAAAGATAAGAAGTTCAATGGAACAGGACC 

ATGTAAAAATGTCAGCACAGTACAATGCACACATGGAATTAGACCAGTAGTA 

TCAACTCAACTGCTGTTAAATGGCAGTCTAGCAGAAGAAGAGATAGTACTTA 

GATCTGAAAATTTCACAGACAATGCTAAAACCATAATAGTACAGCTGAATGA 

ATCTGTAGAAATTAATTGTATAAGACCCAACAATAATACAAGAAAAAGTATA 

CATATAGGACCAGGGAGAGCATTTTATGCAACAGGTGATATAATAGGAGACA 

TAAGACAAGCACATTGTAACATTAGTAAAGCAAACTGGACTAACACTTTAGA 

ACAGATAGTTGAAAAATTAAGAGAACAATTTGGGAATAATAAAACAATAATC 

TTTAATTCATCCTCAGGAGGGGACCCAGAAATTGTATTTCACAGTTTTAATTG 

TGGAGGGGAATTTTTCTATTGTAATACATCACAACTATTTAATAGTACCTGGA 

ATATTACTGAAGAGGTAAATAAGACTAAAGAAAATGACACTATCATACTCCC 

ATGCAGAATAAGACAAATTATAAACATGTGGCAAGAAGTAGGAAAAGCAAT 

GTATGCCCCTCCCATCAGAGGACAAATTAAATGTTCATCAAATATTACAGGG 

CTGCTATTAACTAGAGATGGTGGTACTAACAATAATAGGACGAACGACACCG 

AGACCTTCAGACCTGGGGGAGGAAACATGAAGGACAATTGGAGAAGTGAAT 

TATATAAATATAAAGTAGTAAGAATTGAACCATTAGGAGTAGCACCCACCCA 

GGCAAAGAGAAGAGTGGTGCAAAGAGAGAAAAGAGCAGTGGGACTAGGAG 

CTTTGTTCATTGGGTTCTTGGGAGCAGCAGGAAGCACTATGGGCGCAGCGTC 

AGTGACGCTGACGGTACAGGCCAGACAATTATTGTCTGGTATAGTGCAACAG 

CAGAACAATTTGCTGAGAGCTATTGAGGCGCAACAGCATCTGTTGCAACTCA 

CGGTCTGGGGCATCAAACAGCTCCAGGCAAGAATCCTGGCTGTGGAAAGATA 

CCTAAAGGATCAACAGCTCCTAGGGATTTGGGGTTGCTCTGGAAAACTCATTT 

GCACCACTACTGTGCCTTGGAACTCTAGTTGGAGTAATAAATCTCTGACTGAG 

ATTTGGGATAATATGACCTGGATGGAGTGGGAAAGAGAAATTGGCAATTATA 

CAGGCTTAATATACAATTTAATTGAAATAGCACAAAACCAGCAAGAAAAGAA 

TGAACAAGAATTATTGGAATTAGACAAGTGGGCAAGTTTGTGGAATTGGTTT 

GATATAACAAACTGGCTGTGGTATATAAGAATATTCATAATGATAGTAGGAG 

GCTTGATAGGTTTAAGAATAGTTTTTGCTGTACTTTCTATAGTGAATAGAGTT 

AGGCAGGGATACTCACCAATATCATTGCAGACCCGCCTCCCAGCTCAGAGGG 
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GACCCGACAGGCCCGAAGGAATCGAAGAAGAAGGTGGAGAGAGAGACAGA 

GACAGATCCAATCGATTAGTGCATGGATTATTGGCACTCATCTGGGACGATCT 

GCGGAGCCTGTGCCTCTTCAGCTACCACCGCTTGAGAGACTTACTCTTGATTG 

TAGCGAGGATTGTGGAACTTCTGGGACGCAGGGGGTGGGAAGCCCTCAAGTA 

TTGGTGGAATCTCCTGCAGTATTGGAGTCAGGAGCTAAAGAGTAGTGCTGTT 

AGTTTGTTTAATGCCACAGCAATAGCAGTAGCTGAAGGGACAGATAGGATTA 

TAGAAATAGTACAAAGAATTTTTAGAGCTGTAATTCACATACCTAGAAGAAT 

AAGACAGGGCTTGGAGAGGGCTTTACTATAA 
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GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGGCACCAACAGCACCAGCGGCAC 

CAACAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACCGACAGCTGGGAGAAGATG 

CCCGAGGGCGAGATCAAGAACTGCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 

GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACGCCAGCT 

ACCGCCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGC 

CCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGT 

TCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTC 

CGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCA 

ACTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCT 

ACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAAC 

TGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGAC 

CATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGG 

CGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGA 

GGTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCA 

ACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGC 

AGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAA 

CGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGT 

ACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGC 

GTGGTGCAGCGCGAGAAGCGCTAAGATATCGGATCCTCTAGA 
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GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGG 

AGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCG 

TGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTAC 

AAGGCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCC 

CCAGGAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGG 

TGGAGCAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTG 

AAGCTGACCCCCCTGTGCGTGGGGGCAGGGAACTGCGAGACCAGCGTGATCACCCAGGC 

CTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCG 

CCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGC 

ACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGG 

CAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGA 

CCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAAC 

ACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCAT 

CGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTCG 

AGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAAC 

AGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTT 

CTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGA 

ACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAAC 

ATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTG 

CAGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCA 

CCAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGC 

GAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGC 

CAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCTAAGATATCGGATCCTCTAGA 
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGGCACCAACAGCACCAGCGGCAC 

CAACAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACCGACAGCTGGGAGAAGATG 

CCCGAGGGCGAGATCAAGAACTGCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 

gaaggagtacagcctgttctacaagctggacgtggtgcccatcgacaacgacaacgccagct 

accgcctgatcaactgcaacaccagcgtgatcacccaggcctgccccaaggtgagcttcgagc 

ccatccccatccactactgcgcccccgccggcttcgccatcctgaagtgcaaggacaagaagt 

tcaacggcaccggcccctgcaagaacgtgagcaccgtgcagtgcacccacggcatccgcccc 

gtggtgagcacccagctgctgctgaacggcagcctggccgaggaggagatcgtgctgcgctc 

cgagaacttcaccgacaacgccaagaccatcatcgtgcagctgaacgagtccgtggagatca 

actgcatccgccccaacaacaacacgcgtaagagcatccacatcggccccggccgcgccttct 

acgccaccggcgacatcatcggcgacatccgccaggcccactgcaacatcagcaaggccaac 

tggaccaacaccctcgagcagatcgtggagaagctgcgcgagcagttcggcaacaacaagac 

catcatcttcaacagcagcagcggcggcgaccccgagatcgtgttccacagcttcaactgcgg 

cggcgagttcttctactgcaacaccagccagctgttcaacagcacctggaacatcaccgagga 

GGTCAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCA 

acatgtggcaggaggtgggcaaggccatgtacgccccccccatccgcggccagatcaagtgc 

agcagcaatattaccggcctgctgctgacccgcgacggcggcaccaacaacaaccgcaccaa 

cgacaccgagaccttccgccccggcggcggcaacatgaaggacaactggcgcagcgagctgt 

acaagtacaaggtggtgcgcatcgagcccctgggcgtggcccccacccaggccaagcgccgc 

gtggtgcagcgcgagaagcgcgccgtgggcctgggcgccctgttcatcggcttcctgggcgcc 

gccgggagcaccatgggcgccgcctccgtgaccctgaccgtgcaggcccgccagctgctgag 

cggcatcgtgcagcagcagaacaacctgctgcgcgccatcgaggcccagcagcacctgctgc 

agctgaccgtgtggggcatcaagcagctgcaggcccgcatcctggccgtggagcgctacctg 

aaggaccagcagctgctgggcatctggggctgcagcggcaagctgatctgcaccaccaccgt 

gccctggaacagcagctggagcaacaagagcctgaccgagatctgggacaacatgacctgga 

tggagtgggagcgcgagatcggcaactacaccggcctgatctacaacctgatcgagatcgcc 

cagaaccagcaggagaagaacgagcaggagctgctggagctggacaagtgggccagcctgt 

ggaactggttcgacatcaccaactggctgtggtacatctaagatatcggatcctctaga 
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GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGGCACCAACAGCACCAGCGGCAC 

CAACAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACCGACAGCTGGGAGAAGATG 

CCCGAGGGCGAGATCAAGAACTGCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 

GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACGCCAGCT 

ACCGCCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGC 

CCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGT 

TCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTC 

CGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCA 

ACTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCT 

ACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAAC 

TGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGAC 

CATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGG 

CGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGA 

GGTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCA 

ACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGC 

AGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAA 

CGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGT 

ACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGC 

GTGGTGCAGCGCGAGAAGAGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCC 

GCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAG 

CGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGC 

AGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGCTACCTG 

AAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCACCGT 

GCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAACATGACCTGGA 

TGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCC 

CAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGT 

GGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCTAAGATATCGGATCCTCTAGA 
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G^^cSotGG^^^ 

a rrrr a CCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAAC 
TrGACCAACACCCTCGAGCAGATCGTG 

ACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCG 



rr a rArrGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGT 
^^^™GCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCC 

gc?ggSag?a?catcg^cgcc^ 

?GG?A?^CA C £^ 

g^gg^ 

GA 
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GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGC 

TGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACC 

GTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCG 

CCAGCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTGGGCCACCCA 

CGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACCTGACCAACGTG 

ACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCCGGCC 

AGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 

CGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGC 

CCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGG 

TGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCT 

GCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAC 

GAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGTAAGAGCA 

TCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGA 

CATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTC 

GAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATC 

ATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCA 

ACTGCGGCGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCAC 

CTGGAACATCACCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT 

CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAGGAGGTGGGCAAG 

GCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCAATATTA 

CCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAACGA 

CACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGC 

GAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCA 

CCCAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGG 

GCGCCCTGTTCATCGGCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCCGC 

CTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAG 

CAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGC 

TGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCG 

CTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTG 

ATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGA 

CCGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCA 

ACTACACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACCAGCAGGA 

GAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAA 

CTGGTTCGACATCACCAACTGGCTGTGGTACATCTAAGATATCGGATCCTCTA 

GA 

FIG. 46 

(SEQ ID NO:59) 



SUBSTITUTE SHEET (RULE 26) 



WO 00/39302 



PCT/US99/31245 






Gp140modUS4.DV2 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGC 

TGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACC 

GTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCG 

CCAGCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTGGGCCACCCA 

CGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACCTGACCAACGTG 

ACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCC 

CCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGG 

CACCAACAGCACCAGCGGCACCAACAGCACCAGCGGCACCAACAGCACCAG 

CACCAACAGCACCGACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAA 

CTGCAGCTTCAACATCGGCGCCGGCCGCCTGATCAACTGCAACACCAGCGTG 

ATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACT 

GCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGG 

CACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGC 

CCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGA 

TCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCA 

GCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGT 

AAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCA 

TCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAA 

CACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAA 

GACCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCAC 

AGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAA 

CAGCACCTGGAACATCACCGAGGAGGTGAACAAGACCAAGGAGAACGACAC 

CATCATCCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAGGAGGTG 

GGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCA 

ATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCAC 

CAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTG 

GCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTG 

GCCCCCACCCAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTG 

GGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCCGCCGGGAGCACCATGG 

GCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCAT 

CGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTG 

CTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCG 

TGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGG 

CAAGCTGATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAG 

AGCCTGACCGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAG 

ATCGGCAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACC 

AGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCC 

TGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCTAAGATATCGG 

ATCCTCTAGA 

FIG. 47 

(SEQ ID NO:60)



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGC

TGTGTGGAGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACC 

GTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCG 

CCAGCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTGGGCCACCC 

ACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACCTGACCAACGT 

GACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGA 

GGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCCGGC 

CAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCC 

CCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGG 

CCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCGTG 

GTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGC 

TGCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAA 

CGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGTAAGAGC 

ATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCG 

ACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCT 

CGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCAT 

CATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTC 

AACTGCGGCGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCA 

CCTGGAACATCACCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCA 

TCCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAGGAGGTGGGCAA 

GGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCAATATT

ACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAACG 

ACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCA 

GCGAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCC 

CACCCAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGAGCGCCGTGGGCCT 

GGGCGCCCTGTTCATCGGCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCC 

GCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGC 

AGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCA 

GCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAG 

CGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGC 

TGATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCT 

GACCGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGG 

CAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACCAGCAG 

GAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGG 

AACTGGTTCGACATCACCAACTGGCTGTGGTACATCTAAGATATCGGATCCTC 

TAGA 

FIG. 48 

(SEQ ID NO:61)
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGG

AGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCG 

TGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTAC 

AAGGCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCC 

CCAGGAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGG 

TGGAGCAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTG 

AAGCTGACCCCCCTGTGCGTGGGGGCAGGGAACTGCGAGACCAGCGTGATCACCCAGGC 

CTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCG 

CCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGC 

ACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGG 

CAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGA 

CCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAAC 

ACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCAT 

CGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTCG 

AGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAAC 

AGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTT 

CTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGA 

ACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAAC 

ATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTG 

CAGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCA 

CCAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGC 

GAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGC 

CAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCG 

GCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAG 

GCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGA 

GGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCA 

TCCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGC 

GGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCT 

GACCGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACA 

CCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAG 

GAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTG 

GCTGTGGTACATCTAAGATATCGGATCCTCTAGA 

FIG. 49 

(SEQ ID NO:62)
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGG
AGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCG 
TGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTAC 
AAGGCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCC 
CCAGGAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGG 
TGGAGCAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTG 
AAGCTGACCCCCCTGTGCGTGGGGGCAGGGAACTGCGAGACCAGCGTGATCACCCAGGC 
CTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCG 
CCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGC 
ACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGG 
CAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGA 
CCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAAC 
ACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCAT 
CGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTCG 
AGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAAC 
AGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTT
CTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGA 
ACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAAC 
ATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTG 
CAGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCA 
CCAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGC 
GAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGC 
CAAGCGCCGCGTGGTGCAGCGCGAGAAGAGCGCCGTGGGCCTGGGCGCCCTGTTCATCG 
GCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAG 
GCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGA
GGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCA 
TCCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGC 
GGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCT 
GACCGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACA 
CCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAG 
GAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTG 
GCTGTGGTACATCTAAGATATCGGATCCTCTAGA 

FIG. 50 

(SEQ ID NO:63)
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGGCACCAACAGCACCAGCGGCAC 

CAACAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACCGACAGCTGGGAGAAGATG 

CCCGAGGGCGAGATCAAGAACTGCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 

GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACGCCAGCT 

ACCGCCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGC

CCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGT

TCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCC 

GTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTC 

CGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCA

ACTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCT 

ACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAAC

TGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGAC 

CATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGG

CGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGA 

GGTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCA 

ACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGC 

AGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAA

CGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGT

ACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGC 

GTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCC 

GCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAG 

CGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGC 

AGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGCTACCTG

AAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCACCGT 

GCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAACATGACCTGGA 

TGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCC

CAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGT

GGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCCGCATCTTCATCATGATCGTGGGCG

GCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCT

ACAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCCCAGCGCGGCCCCGACCGCCCCGAGGGC

ATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAACCGCCTGGTGCACGGCCTGCT

GGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCT 

GCTGCTGATCGTGGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGT 

ACTGGTGGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAGCGCCGTGAGCCTGTTC 

AACGCCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCAT 

CTTCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTA 

AGATATCGGATCCTCTAGA 

FIG. 51 

(SEQ ID NO:64)
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGAACTGCACCGACAAGCTGGGCGCCGGCGGCGAGATCAAGAACTGCAGCTTCAACAT 

CACCACCAGCGTGCGCGACAAGGTGCAGAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGG 

TGCCCATCGACAACGACAACGCCAGCTACCGCCTGATCAACTGCAACACCAGCGTGATCACCC 

AGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCG 

CCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGCACC 

GTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTG 

GCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCGT 

GCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGTAAGAGCA 

TCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGG 

CCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTG 

CGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGA

GATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTT

CAACAGCACCTGGAACATCACCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCATCC

TGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCC 

CCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGACCCGCGAC

GGCGGCACCAACAACAACCGCACCAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACAT

GAAGGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCG

TGGCCCCCACCCAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGC

GCCCTGTTCATCGGCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTG

ACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGC

CATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCC

GCATCCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGC

GGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGAC

CGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCC

TGATCTACAACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTG

GAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATC 

CGCATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGC

ATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCCCAG

CGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCA

GCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGT

TCAGCTACCACCGCCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAGCTGCTGGGCC

GCCGCGGCTGGGAGGCCCTGAAGTACTGGTGGAACCTGCTGCAGTACTGGAGCCAGGAGCTG

AAGAGCAGCGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCCGAGGGCACCGACCG

CATCATCGAGATCGTGCAGCGCATCTTCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCA

GGGCCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA

FIG. 52 
GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGG
AGCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCG 
TGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTAC 
AAGGCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCC 
CCAGGAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGG 
TGGAGCAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTG
AAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAA 
CGGCACCAACAGCACCAGCGGCACCAACAGCACCAGCGGCACCAACAGCACCAGCACCA 
ACAGCACCGACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAACTGCAGCTTCAAC 
ATCGGCGCCGGCCGCCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAA 
GGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGA 
AGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAG 
TGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGC 
CGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCG 
TGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGTAAG 
AGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACAT 
CCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAGATCG 
TGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAACAGCAGCAGC 
GGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTG 
CAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGAACAAGACCA 
AGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG 
GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCAA 
TATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAACGACA 
CCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGTAC 
AAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCG
CGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGG 
GCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAG 
CTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 
GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCG 
TGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTG 
ATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGAT 
CTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGA 
TCTACAACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTG 
GAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTA 
CATCCGCATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCG 
TGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCCTGCAGACCCGC 
CTGCCCGCCCAGCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCG 
CGACCGCGACCGCAGCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACC 
TGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGCTGCTGATCGTGGCC 
CGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGTGGAACCT 
GCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAGCGCCGTGAGCCTGTTCAACGCCACCG 
CCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCTTCCGC 
GCCGTGATCCACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAAGA 

TATCGGATCCTCTAGA 

FIG. 53 

(SEQ ID NO:66)



gp160.modUS4delV1/2

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCCGGCCAGGCCTGCCC 

CAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAA 

GTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCA 

CCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAG 

GAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAA 

CGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCG 

GCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCA 

ACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAG 

TTCGGCAACAACAAGACCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTT 

CCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCAC 

CTGGAACATCACCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCC 

GCATCCGCCAGATCATCAACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATC 

CGCGGCCAGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCAC 

CAACAACAACCGCACCAACGACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACA 

ACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCC 

ACCCAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTT 

CATCGGCTTCCTGGGCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCA 

GGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGG 

CCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTG 

GCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCT 

GATCTGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCT 

GGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTAC 

AACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGG 

ACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCCGCATCT 

TCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCATCGTGA 

ACCGCGTGCGCCAGGGCTACAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCCCAGCGCGGC 

CCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAACC 

GCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCT 

ACCACCGCCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAGCTGCTGGGCCGCCGCG 

GCTGGGAGGCCCTGAAGTACTGGTGGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGC 

AGCGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATC 

GAGATCGTGCAGCGCATCTTCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGGCCTG 

GAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA 

FIG. 54 

(SEQ ID NO:67)
gp160.modUS4 del 128-194



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAGGCCGAGGC 

CCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTGAACC 

TGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCATGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

GGGGCAGGGAACTGCGAGACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCC 

CATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGTT

CAACGGCACCGGCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCG

TGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCC

GAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAA

CTGCATCCGCCCCAACAACAACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTA 

CGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACT 

GGACCAACACCCTCGAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACC 

ATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGC 

GGCGAGTTCTTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAG 

GTGAACAAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAA

CATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCA

GCAGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAAC

GACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGTA

CAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCG

TGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCCG

CCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGC

GGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCA

GCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGCTACCTGA

AGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCACCGTG

CCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAACATGACCTGGAT

GGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTACAACCTGATCGAGATCGCCC 

AGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTG 

GAACTGGTTCGACATCACCAACTGGCTGTGGTACATCCGCATCTTCATCATGATCGTGGGCGG 

CCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTA 

CAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCCCAGCGCGGCCCCGACCGCCCCGAGGGCA 

TCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAACCGCCTGGTGCACGGCCTGCTG 

GCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTG 

CTGCTGATCGTGGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTAC

TGGTGGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAGCGCCGTGAGCCTGTTCAA 

CGCCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCTT 

CCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAAGA 

TATCGGATCCTCTAGA 

FIG. 55 

(SEQ ID NO:68)
Ia^actIi^

Iaaagcaatgtatgcccctcccatcagaggacaaattaaatgttcatcaaatattacag 
ggctgctattaactagagatggtggt 

FIG. 56 

(SEQ ID NO:69)



Env_SF162_C4wt 

GGAACTATCACACTCCCATGCAGAATAAAACAAATTATAAACAGGTGGCAGGAAGTAGG 
AAAAGCAATGTATGCCCCTCCCATCAGAGGACAAATTAGATGCTCATCAAATATTACAG 

GACTGCTATTAACAAGAGATGGTGGT 

FIG. 57 

(SEQ ID NO:70)



GACACCA^ 

CAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCAACATCACCG
GCCTGCTGCTGACCCGCGACGGCGGC 

FIG. 58 

(SEQ ID NO:71)



GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGG 
CAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCG 

GCCTGCTGCTGACCCGCGACGGCGGC 

FIG. 59 

(SEQ ID NO:72)



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGA
GCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG 
CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAG 
GCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG 
GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
CAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG 
ACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGACCGGCAGCACCAACGGCACC 
AACAGCACCAGCGGCACCAACAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACC 
GACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAACTGCAGCTTCAACATCACCACC 
AGCGTGCGCGACAAGGTGCAGAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCC 
ATCGACAACGACAACGCCAGCTACCGCCTGATCAACTGCAACACCAGCGTGATCACCCAG 
GCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC 
GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGAGC 
ACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 
AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCCAAGACC 
ATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG 
CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGC 
GACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAG 
ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAACAGCAGC 
AGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC 
TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGAACAAGACC 
AAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG 
GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGCAGCAAT 
ATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC 
GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTGTACAAG 
TACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG 
GTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTGGGCGCC 
GCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCTG 
AGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTG 
CTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGC 
TACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACC 
ACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAAC 
ATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTACAACCTG 
ATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAG 
TGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCCGCATCTTC 
ATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCATCGTG 
AACCGCGTGCGCCAGGGCTACAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCCCAGCGC 
GGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGC 
AACCGCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTG 
TTCAGCTACCACCGCCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAGCTGCTG 
GGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGTGGAACCTGCTGCAGTACTGGAGCCAG 
GAGCTGAAGAGCAGCGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCCGAGGGC 
ACCGACCGCATCATCGAGATCGTGCAGCGCATCTTCCGCGCCGTGATCCACATCCCCCGC 
CGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGAGAATTC 

FIG. 61A 

(SEQ ID NO:73)






CGCCCCCCCCCCCCCCCCCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAGCCGC 

TTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCCGTCTTTT 

GGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAGGGGTCTT 

TCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTTCCTCTG 

GAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCCCCCA 

CCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGCG 

GCACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCC 

TCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGATCT 

GATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTA 

GGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATAATACCATGGGCGC 

CCGCGCCAGCGTGCTGAGCGGCGGCGAGCTGGACAAGTGGGAGAAGATCCGCCTGCGCCC

CGGCGGCAAGAAGAAGTACAAGCTGAAGCACATCGTGTGGGCCAGCCGCGAGCTGGAGCG 

CTTCGCCGTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCCGCCAGATCCTGGGCCA 

GCTGCAGCCCAGCCTGCAGACCGGCAGCGAGGAGCTGCGCAGCCTGTACAACACCGTGGC 

CACCCTGTACTGCGTGCACCAGCGCATCGACGTCAAGGACACCAAGGAGGCCCTGGAGAA 

GATCGAGGAGGAGCAGAACAAGTCCAAGAAGAAGGCCCAGCAGGCCGCCGCCGCCGCCGG 

CACCGGCAACAGCAGCCAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTGCAGGGCCA 

GATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGTGGAGGA 

GAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCAGCGCCCTGAGCGAGGGCGCCACCCC 

CCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCT

GAAGGAGACCATGAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGG 

CCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCAGCGACATCGCCGGCACCACCAG 

CACCCTGCAGGAGCAGATCGGCTGGATGACCAACAACCCCCCCATCCCCGTGGGCGAGAT 

CTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATGTACAGCCCCACCAG 

CATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTA 

CAAGACCCTGCGCGCTGAGCAGGCCAGCCAGGACGTGAAGAACTGGATGACCGAGACCCT 

GCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGAAGGCTCTCGGCCCCGCGGC 

CACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCACAAGGCCCG 

CGTGCTGGCCGAGGCGATGAGCCAGGTGACGAACCCGGCGACCATCATGATGCAGCGCGG 

CAACTTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCAACTGCGGCAAGGAGGGCCACAC 

CGCCAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGCGCTGCGGCCGCGAGGGCCA 

CCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCCAGCTA 

CAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGAGGA 

GAGCTTCCGCTTCGGCGAGGAGAAGACCACCCCCAGCCAGAAGCAGGAGCCCATCGACAA 

GGAGCTGTACCCCCTGACCAGCCTGCGCAGCCTGTTCGGCAACGACCCCAGCAGCCAGTA 

AGAATTCAGACTCGAGCAAGTCTAGA 

FIG. 61B 

(SEQ ID NO:73)



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGG
AGCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCG 
TGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTAC 
GACACCGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCC 
CCAGGAGATCGTGCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGG 
TGGAGCAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTG 
AAGCTGACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGAACGCCACCAACAC 
CAAGAGCAGCAACTGGAAGGAGATGGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGG 
TGACCACCAGCATCCGCAACAAGATGCAGAAGGAGTACGCCCTGTTCTACAAGCTGGAC 
GTGGTGCCCATCGACAACGACAACACCAGCTACAAGCTGATCAACTGCAACACCAGCGT 
GATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCC 
CCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGC 
ACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCT 
GCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCG 
ACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGC 
CCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCAC 
CGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGA 
ACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGACCATC
GTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCGG 
CGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCA 
TCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATC 
AACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCG 
CTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCA 
ACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAG 
CTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAA 
GCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCT 
TCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCC 
CGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGC 
CCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGC 
TGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGC 
AAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGA 
CCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCA 
ACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAG 
CTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCT 
GTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGT 
TCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAG 
ACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 
CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGG 
ACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATC 
GGCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGG 
CAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACG 
CCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATC 
GGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCT 

FIG. 62A 

(SEQ ID NO:74)






GTAACTCGAGCAAGTCTAGAGAATTCCGCCCCCCCCCCCCCCCCCCCTCTCCCTCCCCC 
CCCCCTAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATAT 
GTTATTTTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTG 
TCTTCTTGACGAGCATTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTG 
TTGAATGTCGTGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGT 
AGCGACCCTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAA 
AGCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGT 
TGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAA 
GGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCACATGCT 
TTACATGTGTTTAGTCGAGGTTAAAAAAACGTCTAGGCCCCCCGAACCACGGGGACGTG 
GTTTTCCTTTGAAAAACACGATAATACCATGGGCGCCCGCGCCAGCGTGCTGAGCGGCG
GCGAGCTGGACAAGTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGAAGTACAAG
CTGAAGCACATCGTGTGGGCCAGCCGCGAGCTGGAGCGCTTCGCCGTGAACCCCGGCCT
GCTGGAGACCAGCGAGGGCTGCCGCCAGATCCTGGGCCAGCTGCAGCCCAGCCTGCAGA 
CCGGCAGCGAGGAGCTGCGCAGCCTGTACAACACCGTGGCCACCCTGTACTGCGTGCAC 
CAGCGCATCGACGTCAAGGACACCAAGGAGGCCCTGGAGAAGATCGAGGAGGAGCAGAA 
CAAGTCCAAGAAGAAGGCCCAGCAGGCCGCCGCCGCCGCCGGCACCGGCAACAGCAGCC 
AGGTGAGCCAGAACTACCCCATCGTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCC 
ATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGTGGAGGAGAAGGCCTTCAGCCC
CGAGGTGATCCCCATGTTCAGCGCCCTGAGCGAGGGCGCCACCCCCCAGGACCTGAACA 
CGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGAGACCATC 
AACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCC 
CGGCCAGATGCGCGAGCCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGG
AGCAGATCGGCTGGATGACCAACAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGG 
TGGATCATCCTGGGCCTGAACAAGATCGTGCGGATGTACAGCCCCACCAGCATCCTGGA 
CATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTACAAGACCC 
TGCGCGCTGAGCAGGCCAGCCAGGACGTGAAGAACTGGATGACCGAGACCCTGCTGGTG 
CAGAACGCCAACCCCGACTGCAAGACCATCCTGAAGGCTCTCGGCCCCGCGGCCACCCT 
GGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCACAAGGCCCGCGTGC 
TGGCCGAGGCGATGAGCCAGGTGACGAACCCGGCGACCATCATGATGCAGCGCGGCAAC 
TTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCAACTGCGGCAAGGAGGGCCACACCGC 
CAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGCGCTGCGGCCGCGAGGGCCACC 
AGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCCAGCTAC 
AAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGAGGA 
GAGCTTCCGCTTCGGCGAGGAGAAGACCACCCCCAGCCAGAAGCAGGAGCCCATCGACA 
AGGAGCTGTACCCCCTGACCAGCCTGCGCAGCCTGTTCGGCAACGACCCCAGCAGCCAG 
TAAGAATTCAGACTCGAGCAAGTCTAGA 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGA
GCAGTCTTCGTTTCGCCCAGCGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG 
CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCTTACAAG 
GCCGAGGCCCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG 
GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
CAGATGCATGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCC
GGCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCC 
GGCTTCGCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAAC 
GTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTG 
AACGGCAGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACTTCACCGACAACGCC 
AAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCATCCGCCCCAACAAC 
AACACGCGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATC 
ATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCAAGGCCAACTGGACCAACACCCTC 
GAGCAGATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGACCATCATCTTCAAC 
AGCAGCAGCGGCGGCGACCCCGAGATCGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTC 
TTCTACTGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCACCGAGGAGGTGAAC
AAGACCAAGGAGAACGACACCATCATCCTGCCCTGCCGCATCCGCCAGATCATCAACATG 
TGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCAAGTGCAGC 
AGCAATATTACCGGCCTGCTGCTGACCCGCGACGGCGGCACCAACAACAACCGGACCAAC 
GACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCAGCGAGCTG 
TACAAGTACAAGGTGGTGCGCATCGAGCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGC 
CGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGTTCATCGGCTTCCTG 
GGCGCCGCCGGGAGCACCATGGGCGCCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAG
CTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAG 
CACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTG 
GAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATC 
TGCACCACCACCGTGCCCTGGAACAGCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGG 
GACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTACACCGGCCTGATCTAC 
AACCTGATCGAGATCGCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTG 
GACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCCGC 
ATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGC 
ATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCCTGCAGACCCGCCTGCCCGCC 
CAGCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGAC 
CGCAGCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTG 
TGCCTGTTCAGCTACCACCGCCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAG 
CTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGTGGAACCTGCTGCAGTACTGG 
AGCCAGGAGCTGAAGAGCAGCGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCC 
GAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCTTCCGCGCCGTGATCCACATC 
CCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA 
GAATTCCGCCCCCCCCCCCCCCCCCCCTCTCCCTCCCCCCCCCGTAACGTTACTGGCCGA 
AGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCCG 
TCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAGG 
GGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTT 
CCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAAC
CCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCA 
AAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGG



GATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCA

CGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCAGCGACATCGCCGGCAC 

CACCAGCACCCTGCAGGAGCAGATCGGCTGGATGACCAACAACCCCCCCATCCCCGTGGG 

CGAGATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATGTACAGCCC 

CACCAGCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCG 

CTTCTACAAGACCCTGCGCGCTGAGCAGGCCAGCCAGGACGTGAAGAACTGGATGACCGA

GACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGAAGGCTCTCGGCCC 

CGCGGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCACAA 

GGCCCGCGTGCTGGCCGAGGCGATGAGCCAGGTGACGAACCCGGCGACCATCATGATGCA 

GCGCGGCAACTTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCAACTGCGGCAAGGAGGG 

CCACACCGCCAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGCGCTGCGGCCGCGA 

GGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCC 

CAGCTACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCC 

CGAGGAGAGCTTCCGCTTCGGCGAGGAGAAGACCACCCCCAGCCAGAAGCAGGAGCCCAT 

CGACAAGGAGCTGTACCCCCTGACCAGCCTGCGCAGCCTGTTCGGCAACGACCCCAGCAG 

CCAGTAAGAATTCAGACTCGAGCAAGTCTAGA 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGA
GCAGTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTG 
CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGAC 
ACCGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG 
GAGATCGTGCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAG 
CAGATGCACGAGGACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG 
ACCCCCCTGTGCGTGACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGC 
AGCAACTGGAAGGAGATGGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGGGCGCC 
GGCAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTC 
GAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGAC 
AAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGC 
ATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTG 
GTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAG 
AGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGC 
CCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGC 
AACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 
CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTG
ATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAAC 
AGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGC 
CGCATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCC 
ATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGC 
GGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGAC 
AACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCC 
CCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCC 
ATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTG 
ACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGC 
GCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAG 
GCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGC 
TGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAG 
AGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAAC 
TACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAG 
CAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAG 
TGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATC 
GTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 
CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGC 
GGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGG 
GACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATC 
GCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGC 
AACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCC 
ATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGC 
CGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAA 
CTCGAGCAAGTCTAGAGAATTCCGCCCCCCCCCCCCCCCCCCCTCTCCCTCCCCCCCCCC 
TAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTTATT 
TTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTT 
GACGAGCATTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAATGT

CGTGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCT 

TTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGT 

ATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGT

GGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAA 

GGTACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTA 

GTCGAGGTTAAAAAAACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAA 

AACACGATAATACCATGGGCGCCCGCGCCAGCGTGCTGAGCGGCGGCGAGCTGGACAAGT 

GGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGAAGTACAAGCTGAAGCACATCGTGT 

GGGCCAGCCGCGAGCTGGAGCGCTTCGCCGTGAACCCCGGCCTGCTGGAGACCAGCGAGG 

GCTGCCGCCAGATCCTGGGCCAGCTGCAGCCCAGCCTGCAGACCGGCAGCGAGGAGCTGC 

GCAGCCTGTACAACACCGTGGCCACCCTGTACTGCGTGCACCAGCGCATCGACGTCAAGG 

ACACCAAGGAGGCCCTGGAGAAGATCGAGGAGGAGCAGAACAAGTCCAAGAAGAAGGCCC 

AGCAGGCCGCCGCCGCCGCCGGCACCGGCAACAGCAGCCAGGTGAGCCAGAACTACCCCA 

TCGTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACG 

CCTGGGTGAAGGTGGTGGAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCAGCG 

CCCTGAGCGAGGGCGCCACCCCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCC 

ACCAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACC 

GCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCA 

GCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGGCTGGATGACCAACAACC 

CCCCCATCCCCGTGGGCGAGATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCG 

TGCGGATGTACAGCCCCACCAGCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCC 

GCGACTACGTGGACCGCTTCTACAAGACCCTGCGCGCTGAGCAGGCCAGCCAGGACGTGA 

AGAACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCC 

TGAAGGCTCTCGGCCCCGCGGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGG 

GCGGCCCCGGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGCCAGGTGACGAACCCGG 

CGACCATCATGATGCAGCGCGGCAACTTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCA 

ACTGCGGCAAGGAGGGCCACACCGCCAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCT 

GGCGCTGCGGCCGCGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCC 

TGGGCAAGATCTGGCCCAGCTACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCG 

AGCCCACCGCCCCCCCCGAGGAGAGCTTCCGCTTCGGCGAGGAGAAGACCACCCCCAGCC 

AGAAGCAGGAGCCCATCGACAAGGAGCTGTACCCCCTGACCAGCCTGCGCAGCCTGTTCG 

GCAACGACCCCAGCAGCCAGTAAGAATTCAGACTCGAGCAAGTCTAGA 
                         Start of tPA
                         1                                     40
gp160             (1)    GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCT
gp160 del V1      (1)    GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCT
gp160 del V2      (1)    GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCT
gp160 del V1-2    (1)    GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCT
gp160 del 128-194 (1)    GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCT
gp140TM           (1)    GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCT
gp140             (1)    GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCT
gp140mut          (1)    GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCT
gp120             (1)    GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCT
Consensus         (1)    GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCT

                         41                                    80
gp160             (41)   GTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAG
gp160 del V1      (41)   GTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAG
gp160 del V2      (41)   GTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAG
gp160 del V1-2    (41)   GTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAG
gp160 del 128-194 (41)   GTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAG
gp140TM           (41)   GTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAG
gp140             (41)   GTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAG
gp140mut          (41)   GTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAG
gp120             (41)   GTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAG
Consensus         (41)   GTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAG

                         end of tPA
                         81                                   120
gp160             (81)   CGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG
gp160 del V1      (81)   CGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG
gp160 del V2      (81)   CGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG
gp160 del V1-2    (81)   CGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG
gp160 del 128-194 (81)   CGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG
gp140TM           (81)   CGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG
gp140             (81)   CGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG
gp140mut          (81)   CGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG
gp120             (81)   CGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG
Consensus         (81)   CGCCACCACCGTGCTGTGGGTGACCGTGTACTACGGCGTG

                         121                                  160
gp160             (121)  CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCA
gp160 del V1      (121)  CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCA
gp160 del V2      (121)  CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCA
gp160 del V1-2    (121)  CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCA
gp160 del 128-194 (121)  CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCA
gp140TM           (121)  CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCA
gp140             (121)  CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCA
gp140mut          (121)  CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCA
gp120             (121)  CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCA
Consensus         (121)  CCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCA
                         161                                  200
gp160             (161)  GCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTG
gp160 del V1      (161)  GCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTG
gp160 del V2      (161)  GCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTG
gp160 del V1-2    (161)  GCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTG
gp160 del 128-194 (161)  GCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTG
gp140TM           (161)  GCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTG
gp140             (161)  GCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTG
gp140mut          (161)  GCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTG
gp120             (161)  GCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTG
Consensus         (161)  GCGACGCCAAGGCTTACAAGGCCGAGGCCCACAACGTGTG

                         201                                  240
gp160             (201)  GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG
gp160 del V1      (201)  GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG
gp160 del V2      (201)  GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG
gp160 del V1-2    (201)  GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG
gp160 del 128-194 (201)  GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG
gp140TM           (201)  GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG
gp140             (201)  GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG
gp140mut          (201)  GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG
gp120             (201)  GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG
Consensus         (201)  GGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAG

                         241                                  280
gp160             (241)  GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGT
gp160 del V1      (241)  GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGT
gp160 del V2      (241)  GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGT
gp160 del V1-2    (241)  GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGT
gp160 del 128-194 (241)  GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGT
gp140TM           (241)  GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGT
gp140             (241)  GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGT
gp140mut          (241)  GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGT
gp120             (241)  GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGT
Consensus         (241)  GAGGTGAACCTGACCAACGTGACCGAGAACTTCAACATGT

                         281                                  320
gp160             (281)  GGAAGAACAACATGGTGGAGCAGATGCATGAGGACATCAT
gp160 del V1      (281)  GGAAGAACAACATGGTGGAGCAGATGCATGAGGACATCAT
gp160 del V2      (281)  GGAAGAACAACATGGTGGAGCAGATGCATGAGGACATCAT
gp160 del V1-2    (281)  GGAAGAACAACATGGTGGAGCAGATGCATGAGGACATCAT
gp160 del 128-194 (281)  GGAAGAACAACATGGTGGAGCAGATGCATGAGGACATCAT
gp140TM           (281)  GGAAGAACAACATGGTGGAGCAGATGCATGAGGACATCAT
gp140             (281)  GGAAGAACAACATGGTGGAGCAGATGCATGAGGACATCAT
gp140mut          (281)  GGAAGAACAACATGGTGGAGCAGATGCATGAGGACATCAT
gp120             (281)  GGAAGAACAACATGGTGGAGCAGATGCATGAGGACATCAT
Consensus         (281)  GGAAGAACAACATGGTGGAGCAGATGCATGAGGACATCAT

                         321                                  360
gp160             (321)  CAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG
gp160 del V1      (321)  CAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG
gp160 del V2      (321)  CAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG
gp160 del V1-2    (321)  CAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCC
gp160 del 128-194 (321)  CAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG
gp140TM           (321)  CAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG
gp140             (321)  CAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG
gp140mut          (321)  CAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG
gp120             (321)  CAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG
Consensus         (321)  CAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTG



FIG. 66B-2 



WO 00/39302 



PCT/US99/31245 



                         361                                  400
gp160             (361)  ACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGA
gp160 del V1      (361)  ACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGG
gp160 del V2      (361)  ACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGA
gp160 del V1-2    (361)  GGC-------------------------------------
gp160 del 128-194 (361)  ACCCCCCTGTGCGTGGGGGCAGGG----------------
gp140TM           (361)  ACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGA
gp140             (361)  ACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGA
gp140mut          (361)  ACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGA
gp120             (361)  ACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGA
Consensus         (361)  ACCCCCCTGTGCGTGACCCTGAACTGCACCGACAAGCTGA

gpl 60 (401) CCGGCAGCACCAACGGCACCAACAGCACCAGCGGCACCAA 

„i fi n del VI (401) GCGCCGGC 

gpl 60 del V2 (401) CCGGCAGCACCAACGGCACCAACAGCACCAGCGGCACCAA 

gpl 60 del Vl-2 (364) 

gp 160 del 126-194 (385) --^----;~^ CGTCACCMCflGCACCAGCGGC ACCAA 

aD 14C (401) CCGGCAGCACCAACGGCACCAACAGCACCAGCGGCACCAA 

aD 140mu' (401) CCGGCAGCACCAACGGCACCAACAGCACCAGCGGCACCAA 

QD12C (401) CCGGCAGCACCAACGGCACCAACAGCACCAGCGGCACCAA 

Consensus (401) CCGGCAGCACCAACGGCACCAACAGCACCAGCGGCACCAA 

4 41 4 80 

gpl 60 (441) CAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACC 

gpl60 del V2 (441) CAGCACCAGCGGC^ 

gpl60 del Vl-2 (364) 

gp 160 del 128-194 (385) ~™-~;~""; c ™cAGCACCAGCACCAACAGCACC 
ao 1 4 0 (441) CAGCACCAGCGGCACCAACAGCACCAGC ACCAACAGCACC 
aD 14 0mut (441) CAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACC 
aol 20 4 41 ) CAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACC 
Consensus (44 1) CAGCACCAGCGGCACCAACAGCACCAGCACCAACAGCACC 

4 81 

aol 6" (481) GACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAACT 

.,. , g ? J- 409 GGCGAGATCAAGAACT 

gpl60 del V2 (481) GACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAACT 

gpl60 del Vl-2 (364) " — 

gp 160 del 128-194 (385) -------^~~^ GCCCGAGGGCGAG a T CAAGAACT 

qd140 (481) GACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAACT 
aD 140mut (481) GACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAACT 
OD120 (481) GACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAACT 
Consensus (481) GACAGCTGGGAGAAGATGCCCGAGGGCGAGATCAAGAACT 

521 560 
qpl60 (521) GCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 
aol60 del Vi (521) GCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 

gpl60 del V2 (521) GCAGCTTCAACATCGGCGCCGGC —— 

gpl60 del Vl-2 (521) 

gp 160 del 128-194 (521 ) -~g^ TC ^ CATCACCACCAGCGTGCGCGACAAGGTGCA 
QD140 (521 ) GCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 
aD 140mut (521) GCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 
aol20 (521) GCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 
(521) GCAGCTTCAACATCACCACCAGCGTGCGCGACAAGGTGCA 



Consensus 
                         561                                  600
gp160             (561)  GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCC
gp160 del V1      (465)  GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCC
gp160 del V2      (544)  ----------------------------------------
gp160 del V1-2    (364)  ----------------------------------------
gp160 del 128-194 (385)  ----------------------------------------
gp140TM           (561)  GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCC
gp140             (561)  GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCC
gp140mut          (561)  GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCC
gp120             (561)  GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCC
Consensus         (561)  GAAGGAGTACAGCCTGTTCTACAAGCTGGACGTGGTGCCC

                         601                                  640
gp160             (601)  ATCGACAACGACAACGCCAGCTACCGCCTGATCAACTGCA
gp160 del V1      (505)  ATCGACAACGACAACGCCAGCTACCGCCTGATCAACTGCA
gp160 del V2      (544)  ------------------------CGCCTGATCAACTGCA
gp160 del V1-2    (364)  ----------------------------------------
gp160 del 128-194 (385)  ---------------------------------AACTGCG
gp140TM           (601)  ATCGACAACGACAACGCCAGCTACCGCCTGATCAACTGCA
gp140             (601)  ATCGACAACGACAACGCCAGCTACCGCCTGATCAACTGCA
gp140mut          (601)  ATCGACAACGACAACGCCAGCTACCGCCTGATCAACTGCA
gp120             (601)  ATCGACAACGACAACGCCAGCTACCGCCTGATCAACTGCA
Consensus         (601)  ATCGACAACGACAACGCCAGCTACCGCCTGATCAACTGCA

                         641                                  680
gp160             (641)  ACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTT
gp160 del V1      (545)  ACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTT
gp160 del V2      (560)  ACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTT
gp160 del V1-2    (364)  -----------------CAGGCCTGCCCCAAGGTGAGCTT
gp160 del 128-194 (392)  AGACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTT
gp140TM           (641)  ACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTT
gp140             (641)  ACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTT
gp140mut          (641)  ACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTT
gp120             (641)  ACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTT
Consensus         (641)  ACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTT

                         681                                  720
gp160             (681)  CGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC
gp160 del V1      (585)  CGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC
gp160 del V2      (600)  CGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC
gp160 del V1-2    (387)  CGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC
gp160 del 128-194 (432)  CGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC
gp140TM           (681)  CGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC
gp140             (681)  CGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC
gp140mut          (681)  CGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC
gp120             (681)  CGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC
Consensus         (681)  CGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTC

                         721                                  760
gp160             (721)  GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCG
gp160 del V1      (625)  GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCG
gp160 del V2      (640)  GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCG
gp160 del V1-2    (427)  GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCG
gp160 del 128-194 (472)  GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCG
gp140TM           (721)  GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCG
gp140             (721)  GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCG
gp140mut          (721)  GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCG
gp120             (721)  GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCG
Consensus         (721)  GCCATCCTGAAGTGCAAGGACAAGAAGTTCAACGGCACCG
                         761                                  800
gp160             (761)  GCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGG
gp160 del V1      (665)  GCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGG
gp160 del V2      (680)  GCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGG
gp160 del V1-2    (467)  GCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGG
gp160 del 128-194 (512)  GCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGG
gp140TM           (761)  GCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGG
gp140             (761)  GCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGG
gp140mut          (761)  GCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGG
gp120             (761)  GCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGG
Consensus         (761)  GCCCCTGCAAGAACGTGAGCACCGTGCAGTGCACCCACGG

                         801                                  840
gp160             (801)  CATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC
gp160 del V1      (705)  CATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC
gp160 del V2      (720)  CATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC
gp160 del V1-2    (507)  CATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC
gp160 del 128-194 (552)  CATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC
gp140TM           (801)  CATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC
gp140             (801)  CATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC
gp140mut          (801)  CATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC
gp120             (801)  CATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC
Consensus         (801)  CATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC

                         841                                  880
gp160             (841)  AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACT
gp160 del V1      (745)  AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACT
gp160 del V2      (760)  AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACT
gp160 del V1-2    (547)  AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACT
gp160 del 128-194 (592)  AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACT
gp140TM           (841)  AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACT
gp140             (841)  AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACT
gp140mut          (841)  AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACT
gp120             (841)  AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACT
Consensus         (841)  AGCCTGGCCGAGGAGGAGATCGTGCTGCGCTCCGAGAACT

                         881                                  920
gp160             (881)  TCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGA
gp160 del V1      (785)  TCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGA
gp160 del V2      (800)  TCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGA
gp160 del V1-2    (587)  TCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGA
gp160 del 128-194 (632)  TCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGA
gp140TM           (881)  TCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGA
gp140             (881)  TCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGA
gp140mut          (881)  TCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGA
gp120             (881)  TCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGA
Consensus         (881)  TCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGA

                         921                                  960
gp160             (921)  GTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG
gp160 del V1      (825)  GTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG
gp160 del V2      (840)  GTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG
gp160 del V1-2    (627)  GTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG
gp160 del 128-194 (672)  GTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG
gp140TM           (921)  GTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG
gp140             (921)  GTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG
gp140mut          (921)  GTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG
gp120             (921)  GTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG
Consensus         (921)  GTCCGTGGAGATCAACTGCATCCGCCCCAACAACAACACG



FIG. 66B-5 

SUBSTITUTE SHEET (RULE 26) 






                         961                                 1000
gp160             (961)  CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACG
gp160 del V1      (865)  CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACG
gp160 del V2      (880)  CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACG
gp160 del V1-2    (667)  CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACG
gp160 del 128-194 (712)  CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACG
gp140TM           (961)  CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACG
gp140             (961)  CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACG
gp140mut          (961)  CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACG
gp120             (961)  CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACG
Consensus         (961)  CGTAAGAGCATCCACATCGGCCCCGGCCGCGCCTTCTACG

                         1001                                1040
gp160             (1001) CCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG
gp160 del V1      (905)  CCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG
gp160 del V2      (920)  CCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG
gp160 del V1-2    (707)  CCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG
gp160 del 128-194 (752)  CCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG
gp140TM           (1001) CCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG
gp140             (1001) CCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG
gp140mut          (1001) CCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG
gp120             (1001) CCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG
Consensus         (1001) CCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG

                         1041                                1080
gp160             (1041) CAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAG
gp160 del V1      (945)  CAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAG
gp160 del V2      (960)  CAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAG
gp160 del V1-2    (747)  CAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAG
gp160 del 128-194 (792)  CAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAG
gp140TM           (1041) CAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAG
gp140             (1041) CAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAG
gp140mut          (1041) CAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAG
gp120             (1041) CAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAG
Consensus         (1041) CAACATCAGCAAGGCCAACTGGACCAACACCCTCGAGCAG

                         1081                                1120
gp160             (1081) ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGA
gp160 del V1      (985)  ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGA
gp160 del V2      (1000) ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGA
gp160 del V1-2    (787)  ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGA
gp160 del 128-194 (832)  ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGA
gp140TM           (1081) ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGA
gp140             (1081) ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGA
gp140mut          (1081) ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGA
gp120             (1081) ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGA
Consensus         (1081) ATCGTGGAGAAGCTGCGCGAGCAGTTCGGCAACAACAAGA

                         1121                                1160
gp160             (1121) CCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGAT
gp160 del V1      (1025) CCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGAT
gp160 del V2      (1040) CCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGAT
gp160 del V1-2    (827)  CCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGAT
gp160 del 128-194 (872)  CCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGAT
gp140TM           (1121) CCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGAT
gp140             (1121) CCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGAT
gp140mut          (1121) CCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGAT
gp120             (1121) CCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGAT
Consensus         (1121) CCATCATCTTCAACAGCAGCAGCGGCGGCGACCCCGAGAT
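                         1161                                1200
gp160             (1161) CGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC
gp160 del V1      (1065) CGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC
gp160 del V2      (1080) CGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC
gp160 del V1-2    (867)  CGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC
gp160 del 128-194 (912)  CGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC
gp140TM           (1161) CGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC
gp140             (1161) CGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC
gp140mut          (1161) CGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC
gp120             (1161) CGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC
Consensus         (1161) CGTGTTCCACAGCTTCAACTGCGGCGGCGAGTTCTTCTAC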

                         1201                                1240
gp160             (1201) TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCA
gp160 del V1      (1105) TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCA
gp160 del V2      (1120) TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCA
gp160 del V1-2    (907)  TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCA
gp160 del 128-194 (952)  TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCA
gp140TM           (1201) TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCA
gp140             (1201) TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCA
gp140mut          (1201) TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCA
gp120             (1201) TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCA
Consensus         (1201) TGCAACACCAGCCAGCTGTTCAACAGCACCTGGAACATCA

                         1241                                1280
gp160             (1241) CCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT
gp160 del V1      (1145) CCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT
gp160 del V2      (1160) CCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT
gp160 del V1-2    (947)  CCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT
gp160 del 128-194 (992)  CCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT
gp140TM           (1241) CCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT
gp140             (1241) CCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT
gp140mut          (1241) CCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT
gp120             (1241) CCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT
Consensus         (1241) CCGAGGAGGTGAACAAGACCAAGGAGAACGACACCATCAT

                         1281                                1320
gp160             (1281) CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG
gp160 del V1      (1185) CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG
gp160 del V2      (1200) CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG
gp160 del V1-2    (987)  CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG
gp160 del 128-194 (1032) CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG
gp140TM           (1281) CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG
gp140             (1281) CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG
gp140mut          (1281) CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG
gp120             (1281) CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG
Consensus         (1281) CCTGCCCTGCCGCATCCGCCAGATCATCAACATGTGGCAG

                         1321                                1360
gp160             (1321) GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCC
gp160 del V1      (1225) GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCC
gp160 del V2      (1240) GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCC
gp160 del V1-2    (1027) GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCC
gp160 del 128-194 (1072) GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCC
gp140TM           (1321) GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCC
gp140             (1321) GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCC
gp140mut          (1321) GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCC
gp120             (1321) GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCC
Consensus         (1321) GAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCC
                         1361                                1400
gp160             (1361) AGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGAC
gp160 del V1      (1265) AGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGAC
gp160 del V2      (1280) AGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGAC
gp160 del V1-2    (1067) AGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGAC
gp160 del 128-194 (1112) AGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGAC
gp140TM           (1361) AGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGAC
gp140             (1361) AGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGAC
gp140mut          (1361) AGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGAC
gp120             (1361) AGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGAC
Consensus         (1361) AGATCAAGTGCAGCAGCAATATTACCGGCCTGCTGCTGAC

                         1401                                1440
gp160             (1401) CCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC
gp160 del V1      (1305) CCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC
gp160 del V2      (1320) CCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC
gp160 del V1-2    (1107) CCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC
gp160 del 128-194 (1152) CCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC
gp140TM           (1401) CCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC
gp140             (1401) CCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC
gp140mut          (1401) CCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC
gp120             (1401) CCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC
Consensus         (1401) CCGCGACGGCGGCACCAACAACAACCGCACCAACGACACC

                         1441                                1480
gp160             (1441) GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACT
gp160 del V1      (1345) GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACT
gp160 del V2      (1360) GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACT
gp160 del V1-2    (1147) GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACT
gp160 del 128-194 (1192) GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACT
gp140TM           (1441) GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACT
gp140             (1441) GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACT
gp140mut          (1441) GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACT
gp120             (1441) GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACT
Consensus         (1441) GAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACT

                         1481                                1520
gp160             (1481) GGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGA
gp160 del V1      (1385) GGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGA
gp160 del V2      (1400) GGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGA
gp160 del V1-2    (1187) GGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGA
gp160 del 128-194 (1232) GGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGA
gp140TM           (1481) GGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGA
gp140             (1481) GGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGA
gp140mut          (1481) GGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGA
gp120             (1481) GGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGA
Consensus         (1481) GGCGCAGCGAGCTGTACAAGTACAAGGTGGTGCGCATCGA

                         1521                                1560
gp160             (1521) GCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG
gp160 del V1      (1425) GCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG
gp160 del V2      (1440) GCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG
gp160 del V1-2    (1227) GCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG
gp160 del 128-194 (1272) GCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG
gp140TM           (1521) GCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG
gp140             (1521) GCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG
gp140mut          (1521) GCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG
gp120             (1521) GCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG
Consensus         (1521) GCCCCTGGGCGTGGCCCCCACCCAGGCCAAGCGCCGCGTG
                         1561                                1600
gp160             (1561) GTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGT
gp160 del V1      (1465) GTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGT
gp160 del V2      (1480) GTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGT
gp160 del V1-2    (1267) GTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGT
gp160 del 128-194 (1312) GTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGT
gp140TM           (1561) GTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGT
gp140             (1561) GTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGT
gp140mut          (1561) GTGCAGCGCGAGAAGAGCGCCGTGGGCCTGGGCGCCCTGT
gp120             (1561) GTGCAGCGCGAGAAGCGCTAAG------------------
Consensus         (1561) GTGCAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCCTGT

                         1601                                1640
gp160             (1601) TCATCGGCTTC-CTGGGCGCCGCCGGGAGCACCATGGGCG
gp160 del V1      (1505) TCATCGGCTTC-CTGGGCGCCGCCGGGAGCACCATGGGCG
gp160 del V2      (1520) TCATCGGCTTC-CTGGGCGCCGCCGGGAGCACCATGGGCG
gp160 del V1-2    (1307) TCATCGGCTTC-CTGGGCGCCGCCGGGAGCACCATGGGCG
gp160 del 128-194 (1352) TCATCGGCTTC-CTGGGCGCCGCCGGGAGCACCATGGGCG
gp140TM           (1601) TCATCGGCTTC-CTGGGCGCCGCCGGGAGCACCATGGGCG
gp140             (1601) TCATCGGCTTC-CTGGGCGCCGCCGGGAGCACCATGGGCG
gp140mut          (1601) TCATCGGCTTC-CTGGGCGCCGCCGGGAGCACCATGGGCG
gp120             (1583) ATATCGGATCCTCTAGA-----------------------
Consensus         (1601) TCATCGGCTTCNCTGGGCGCCGCCGGGAGCACCATGGGCG

                         1641                                1680
gp160             (1640) CCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCT
gp160 del V1      (1544) CCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCT
gp160 del V2      (1559) CCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCT
gp160 del V1-2    (1346) CCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCT
gp160 del 128-194 (1391) CCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCT
gp140TM           (1640) CCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCT
gp140             (1640) CCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCT
gp140mut          (1640) CCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCT
gp120             (1600) ----------------------------------------
Consensus         (1641) CCGCCTCCGTGACCCTGACCGTGCAGGCCCGCCAGCTGCT

                         1681                                1720
gp160             (1680) GAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCC
gp160 del V1      (1584) GAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCC
gp160 del V2      (1599) GAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCC
gp160 del V1-2    (1386) GAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCC
gp160 del 128-194 (1431) GAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCC
gp140TM           (1680) GAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCC
gp140             (1680) GAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCC
gp140mut          (1680) GAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCC
gp120             (1600) ----------------------------------------
Consensus         (1681) GAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCC

                         1721                                1760
gp160             (1720) ATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG
gp160 del V1      (1624) ATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG
gp160 del V2      (1639) ATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG
gp160 del V1-2    (1426) ATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG
gp160 del 128-194 (1471) ATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG
gp140TM           (1720) ATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG
gp140             (1720) ATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG
gp140mut          (1720) ATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG
gp120             (1600) ----------------------------------------
Consensus         (1721) ATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG
                         1761                                1800
gp160             (1760) GCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCG
gp160 del V1      (1664) GCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCG
gp160 del V2      (1679) GCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCG
gp160 del V1-2    (1466) GCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCG
gp160 del 128-194 (1511) GCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCG
gp140TM           (1760) GCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCG
gp140             (1760) GCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCG
gp140mut          (1760) GCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCG
Consensus         (1761) GCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCG

                         1801                                1840
gp160             (1800) CTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGC
gp160 del V1      (1704) CTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGC
gp160 del V2      (1719) CTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGC
gp160 del V1-2    (1506) CTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGC
gp160 del 128-194 (1551) CTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGC
gp140TM           (1800) CTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGC
gp140             (1800) CTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGC
gp140mut          (1800) CTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGC
Consensus         (1801) CTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGC

                         1841                                1880
gp160             (1840) AGCGGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACA
gp160 del V1      (1744) AGCGGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACA
gp160 del V2      (1759) AGCGGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACA
gp160 del V1-2    (1546) AGCGGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACA
gp160 del 128-194 (1591) AGCGGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACA
gp140TM           (1840) AGCGGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACA
gp140             (1840) AGCGGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACA
gp140mut          (1840) AGCGGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACA
Consensus         (1841) AGCGGCAAGCTGATCTGCACCACCACCGTGCCCTGGAACA

                         1881                                1920
gp160             (1880) GCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAA
gp160 del V1      (1784) GCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAA
gp160 del V2      (1799) GCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAA
gp160 del V1-2    (1586) GCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAA
gp160 del 128-194 (1631) GCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAA
gp140TM           (1880) GCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAA
gp140             (1880) GCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAA
gp140mut          (1880) GCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAA
Consensus         (1881) GCAGCTGGAGCAACAAGAGCCTGACCGAGATCTGGGACAA

                         1921                                1960
gp160             (1920) CATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTAC
gp160 del V1      (1824) CATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTAC
gp160 del V2      (1839) CATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTAC
gp160 del V1-2    (1626) CATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTAC
gp160 del 128-194 (1671) CATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTAC
gp140TM           (1920) CATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTAC
gp140             (1920) CATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTAC
gp140mut          (1920) CATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTAC
Consensus         (1921) CATGACCTGGATGGAGTGGGAGCGCGAGATCGGCAACTAC
1961 2000
gp160 (1960) ACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACC
gp160 del V1 (1864) ACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACC
gp160 del V2 (1879) ACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACC
gp160 del V1-2 (1666) ACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACC
gp160 del 128-194 (1711) ACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACC
gp140TM (1960) ACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACC
gp140 (1960) ACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACC
gp140mut (1960) ACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACC
Consensus (1961) ACCGGCCTGATCTACAACCTGATCGAGATCGCCCAGAACC

2001 2040
gp160 (2000) AGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAA
gp160 del V1 (1904) AGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAA
gp160 del V2 (1919) AGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAA
gp160 del V1-2 (1706) AGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAA
gp160 del 128-194 (1751) AGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAA
gp140TM (2000) AGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAA
gp140 (2000) AGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAA
gp140mut (2000) AGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAA
Consensus (2001) AGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAA

2041 2080
gp160 (2040) GTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGG
gp160 del V1 (1944) GTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGG
gp160 del V2 (1959) GTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGG
gp160 del V1-2 (1746) GTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGG
gp160 del 128-194 (1791) GTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGG
gp140TM (2040) GTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGG
gp140 (2040) GTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGG
gp140mut (2040) GTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGG
Consensus (2041) GTGGGCCAGCCTGTGGAACTGGTTCGACATCACCAACTGG

2081 2120
gp160 (2080) CTGTGGTACATCCGCATCTTCATCATGATCGTGGGCGGCC
gp160 del V1 (1984) CTGTGGTACATCCGCATCTTCATCATGATCGTGGGCGGCC
gp160 del V2 (1999) CTGTGGTACATCCGCATCTTCATCATGATCGTGGGCGGCC
gp160 del V1-2 (1786) CTGTGGTACATCCGCATCTTCATCATGATCGTGGGCGGCC
gp160 del 128-194 (1831) CTGTGGTACATCCGCATCTTCATCATGATCGTGGGCGGCC
gp140TM (2080) CTGTGGTACATCCGCATCTTCATCATGATCGTGGGCGGCC
gp140 (2080) CTGTGGTACATC----------------------------
gp140mut (2080) CTGTGGTACATC----------------------------
Consensus (2081) CTGTGGTACATCCGCATCTTCATCATGATCGTGGGCGGCC

2121 2160
gp160 (2120) TGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCA----
gp160 del V1 (2024) TGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCA----
gp160 del V2 (2039) TGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCA----
gp160 del V1-2 (1826) TGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCA----
gp160 del 128-194 (1871) TGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCA----
gp140TM (2120) TGATCGGCCTGCGCATCGTGTTCGCCGTGCTGAGCATCGT
gp140 (2092)
gp140mut (2092)
gp120 (1600)
Consensus (2121) TGA
2161 2200
gp160 (2156) -TCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCC
gp160 del V1 (2060) -TCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCC
gp160 del V2 (2075) -TCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCC
gp160 del V1-2 (1862) -TCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCC
gp160 del 128-194 (1907) -TCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCC
gp140TM (2160) GTAAGATATCGGATCCTCTAGA------------------
gp140 (2092) -TAAGATATCGGATCCTCTAGA------------------
gp140mut (2092) -TAAGATATCGGATCCTCTAGA------------------
Consensus (2161) NTCGTGAACCGCGTGCGCCAGGGCTACAGCCCCATCAGCC

2201 2240
gp160 (2195) TGCAGACCCGCCTGCCCGCCCAGCGCGGCCCCGACCGCCC
gp160 del V1 (2099) TGCAGACCCGCCTGCCCGCCCAGCGCGGCCCCGACCGCCC
gp160 del V2 (2114) TGCAGACCCGCCTGCCCGCCCAGCGCGGCCCCGACCGCCC
gp160 del V1-2 (1901) TGCAGACCCGCCTGCCCGCCCAGCGCGGCCCCGACCGCCC
gp160 del 128-194 (1946) TGCAGACCCGCCTGCCCGCCCAGCGCGGCCCCGACCGCCC
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2201) TGCAGACCCGCCTGCCCGCCCAGCGCGGCCCCGACCGCCC

2241 2280
gp160 (2235) CGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGAC
gp160 del V1 (2139) CGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGAC
gp160 del V2 (2154) CGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGAC
gp160 del V1-2 (1941) CGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGAC
gp160 del 128-194 (1986) CGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGAC
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2241) CGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGAC

2281 2320
gp160 (2275) CGCAGCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCT
gp160 del V1 (2179) CGCAGCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCT
gp160 del V2 (2194) CGCAGCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCT
gp160 del V1-2 (1981) CGCAGCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCT
gp160 del 128-194 (2026) CGCAGCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCT
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2281) CGCAGCAACCGCCTGGTGCACGGCCTGCTGGCCCTGATCT

2321 2360
gp160 (2315) GGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCG
gp160 del V1 (2219) GGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCG
gp160 del V2 (2234) GGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCG
gp160 del V1-2 (2021) GGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCG
gp160 del 128-194 (2066) GGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCG
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2321) GGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCG
2361 2400
gp160 (2355) CCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAG
gp160 del V1 (2259) CCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAG
gp160 del V2 (2274) CCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAG
gp160 del V1-2 (2061) CCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAG
gp160 del 128-194 (2106) CCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAG
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2361) CCTGCGCGACCTGCTGCTGATCGTGGCCCGCATCGTGGAG

2401 2440
gp160 (2395) CTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGT
gp160 del V1 (2299) CTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGT
gp160 del V2 (2314) CTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGT
gp160 del V1-2 (2101) CTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGT
gp160 del 128-194 (2146) CTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGT
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2401) CTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGT

2441 2480
gp160 (2435) GGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAG
gp160 del V1 (2339) GGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAG
gp160 del V2 (2354) GGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAG
gp160 del V1-2 (2141) GGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAG
gp160 del 128-194 (2186) GGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAG
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2441) GGAACCTGCTGCAGTACTGGAGCCAGGAGCTGAAGAGCAG

2481 2520
gp160 (2475) CGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCC
gp160 del V1 (2379) CGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCC
gp160 del V2 (2394) CGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCC
gp160 del V1-2 (2181) CGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCC
gp160 del 128-194 (2226) CGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCC
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2481) CGCCGTGAGCCTGTTCAACGCCACCGCCATCGCCGTGGCC

2521 2560
gp160 (2515) GAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCT
gp160 del V1 (2419) GAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCT
gp160 del V2 (2434) GAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCT
gp160 del V1-2 (2221) GAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCT
gp160 del 128-194 (2266) GAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCT
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2521) GAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCATCT
2561 2600
gp160 (2555) TCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGG
gp160 del V1 (2459) TCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGG
gp160 del V2 (2474) TCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGG
gp160 del V1-2 (2261) TCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGG
gp160 del 128-194 (2306) TCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGG
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2561) TCCGCGCCGTGATCCACATCCCCCGCCGCATCCGCCAGGG

2601 2640
gp160 (2595) CCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA
gp160 del V1 (2499) CCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA
gp160 del V2 (2514) CCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA
gp160 del V1-2 (2301) CCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA
gp160 del 128-194 (2346) CCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2601) CCTGGAGCGCGCCCTGCTGTAAGATATCGGATCCTCTAGA

2641 2680
gp160 (2635) AAGCCATGGATATCGGATCCACTACGCGTTAGAGCTCGCT
gp160 del V1 (2539)
gp160 del V2 (2554)
gp160 del V1-2 (2341)
gp160 del 128-194 (2386)
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
Consensus (2641) NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN

2681
gp160 (2675) GATCAGCT
gp160 del V1 (2539)
gp160 del V2 (2554)
gp160 del V1-2 (2341)
gp160 del 128-194 (2386)
gp140TM (2182)
gp140 (2113)
gp140mut (2113)
gp120 (1600)
Consensus (2681) NNNNNNNN
HIV-1SF2 wt RT (PISPIET-->GIRKVL)



n^^arraATGGCCATTGACAGAAGAAAAAATAAAAGCATTAGTAGAGATATGTACAGAA 
CAAAGTAGCATGACAAAAATC., GATCTGACT TAGAAATAGGGCAGCATAGAACA 




GTGGGA^AATTG^ATTGGGCAAGTCAGATTTATGCAGGGATTAAAGTAAAGCAGT^ 

arr R.AAGCACTAACAGAAGTAATACCACTAACAGAAGAAGCAGAG 



™n * a rT-rr AGAAAACAGGGAGATTCTAAAAGAACCAGTALAi b^b i ai/h x« - 
C T ^^rr^?^AGCAGAAATACAGAAGCAGGGGCAAGGCCAATG 

rACACTA^TGATGTAAAACA^^AACAGAGGCAGTGCA 

CACACTAAT ^™^rr?AJ^ T^TAAACT ACCCATACAAAAGGAAACATGGGAAGCA 
TATGTAGATGGGoCAGCTA^ 

^I?^^?^tStaaaaaaggaaaaggtctacctgg 
GagProtMod.SF2 (GP1)

GTCGACGCCACCATGGGCGCCCGCGCCAGCGTGCTGAGCGGCGGCGAGCTGGACAAGTGG 

GAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGAAGTACAAGCTGAAGCACATCGTGTGG 

GCCAGCCGCGAGCTGGAGCGCTTCGCCGTGAACCCCGGCCTGCTGGAGACCAGCGAGGGC 

TGCCGCCAGATCCTGGGCCAGCTGCAGCCCAGCCTGCAGACCGGCAGCGAGGAGCTGCGC 

AGCCTGTACAACACCGTGGCCACCCTGTACTGCGTGCACCAGCGCATCGACGTCAAGGAC 

ACCAAGGAGGCCCTGGAGAAGATCGAGGAGGAGCAGAACAAGTCCAAGAAGAAGGCCCAG 

CAGGCCGCCGCCGCCGCCGGCACCGGCAACAGCAGCCAGGTGAGCCAGAACTACCCCATC 

GTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCC 

TGGGTGAAGGTGGTGGAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCAGCGCC 

CTGAGCGAGGGCGCCACCCCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCAC 

CAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGC 

GTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCAGC

GACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGGCTGGATGACCAACAACCCC

CCCATCCCCGTGGGCGAGATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTG

CGGATGTACAGCCCCACCAGCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGC 

GACTACGTGGACCGCTTCTACAAGACCCTGCGCGCTGAGCAGGCCAGCCAGGACGTGAAG 

AACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTG 

AAGGCTCTCGGCCCCGCGGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGC 

GGCCCCGGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGCCAGGTGACGAACCCGGCG 

ACCATCATGATGCAGCGCGGCAACTTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCAAC 

TGCGGCAAGGAGGGCCACACCGCCAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGG 

CGCTGCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTTTTA 

GGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGAG 

CCAACAGCCCCACCAGAAGAGAGCTTCAGGTTTGGGGAGGAGAAAACAACTCCCTCTCAG 

AAGCAGGAGCCGATAGACAAGGAACTGTATCCTTTAACTTCCCTCAGATCACTCTTTGGC 

AACGACCCCTCGTCACAGTAAGGATCGGCGGCCAGCTCAAGGAGGCGCTGCTCGACACCG 

GCGCCGACGACACCGTGCTGGAGGAGATGAACCTGCCCGGCAAGTGGAAGCCCAAGATGA 

TCGGCGGGATCGGGGGCTTCATCAAGGTGCGGCAGTACGACCAGATCCCCGTGGAGATCT 

GCGGCCACAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGTGAACATCATCGGCC 

GCAACCTGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACGG 

TGCCCGTGAAGCTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAGTGGCCCCTGTAAG 
GagProtMod.SF2 (GP2)

GTCGACGCCACCATGGGCGCCCGCGCCAGCGTGCTGAGCGGCGGCGAGCTGGACAAGTGG 

GAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGAAGTACAAGCTGAAGCACATCGTGTGG 

GCCAGCCGCGAGCTGGAGCGCTTCGCCGTGAACCCCGGCCTGCTGGAGACCAGCGAGGGC 

TGCCGCCAGATCCTGGGCCAGCTGCAGCCCAGCCTGCAGACCGGCAGCGAGGAGCTGCGC 

AGCCTGTACAACACCGTGGCCACCCTGTACTGCGTGCACCAGCGCATCGACGTCAAGGAC 

ACCAAGGAGGCCCTGGAGAAGATCGAGGAGGAGCAGAACAAGTCCAAGAAGAAGGCCCAG 

CAGGCCGCCGCCGCCGCCGGCACCGGCAACAGCAGCCAGGTGAGCCAGAACTACCCCATC 

GTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCC 

TGGGTGAAGGTGGTGGAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCAGCGCC 

CTGAGCGAGGGCGCCACCCCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCAC 

CAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGC 

GTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCAGC 

GACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGGCTGGATGACCAACAACCCC 

CCCATCCCCGTGGGCGAGATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTG

CGGATGTACAGCCCCACCAGCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGC 

GACTACGTGGACCGCTTCTACAAGACCCTGCGCGCTGAGCAGGCCAGCCAGGACGTGAAG 

AACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTG 

AAGGCTCTCGGCCCCGCGGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGC 

GGCCCCGGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGCCAGGTGACGAACCCGGCG 

ACCATCATGATGCAGCGCGGCAACTTCCGCAACCAGCGGAAGACCGTCAAGTGCTTCAAC 

TGCGGCAAGGAGGGCCACACCGCCAGGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGG 

CGCTGCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTTTTA

GGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGAG 

CCAACAGCCCCACCAGAAGAGAGCTTCAGGTTTGGGGAGGAGAAAACAACTCCCTCTCAG 

AAGCAGGAGCCGATAGACAAGGAACTGTATCCTTTAACTTCCCTCAGATCACTCTTTGGC 

AACGACCCCTCGTCACAGTAAGGATCGGGGGGCAACTCAAGGAAGCGCTGCTCGATACAG 

GAGCAGATGATACAGTATTAGAAGAAATGAATTTGCCAGGAAAATGGAAACCAAAAATGA 

TAGGGGGGATCGGGGGCTTCATCAAGGTGAGGCAGTACGACCAGATACCTGTAGAAATCT 

GTGGACATAAAGCTATAGGTACAGTATTAGTAGGACCTACACCTGTCAACATAATTGGAA 

GAAATCTGTTGACCCAGATCGGCTGCACCTTGAACTTCCCCATCAGCCCTATTGAGACGG 

TGCCCGTGAAGTTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAATGGCCATTGTAAG



AATTC 



FIG. 70 

(SEQ ID NO:79)



WO 00/39302 



PCT/US99/31245 



115/131 

FS(+)_ProtInact_RTopt_YM 

GCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTTTTAGGGA 
AGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGAGCCAA 
CAGCCCCACCAGAAGAGAGCTTCAGGTTTGGGGAGGAGAAAACAACTCCCTCTCAGAAGC 
AGGAGCCGATAGACAAGGAACTGTATCCTTTAACTTCCCTCAGATCACTCTTTGGCAACG 
ACCCCTCGTCACAATAAGGATCGGGGGGCAACTCAAGGAAGCGCTGCTCGATACAGGAGC 
AGATGATACAGTATTAGAAGAAATGAATTTGCCAGGAAAATGGAAACCAAAAATGATAGG 
GGGGATCGGGGGCTTCATCAAGGTGAGGCAGTACGACCAGATACCTGTAGAAATCTGTGG 
ACATAAAGCTATAGGTACAGTATTAGTAGGACCTACACCTGTCAACATAATTGGAAGAAA 
TCTGTTGACCCAGATCGGCTGCACCTTGAACTTCCCCATCAGCCCTATTGAGACGGTGCC 
CGTGAAGTTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAATGGCCATTGACCGAGGA 
GAAGATCAAGGCCCTGGTGGAGATCTGCACCGAGATGGAGAAGGAGGGCAAGATCAGCAA 
GATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCAC 
CAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGA 
GGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGCT 
GGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACAAGGACTTCCGCAAGTACACCGC 
CTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCT
GCCCCAGGGCTGGAAGGGCAGCCCCGCCATCTTCCAGAGCAGCATGACCAAGATCCTGGA 
GCCCTTCCGCAAGCAGAACGCCGACATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAG 
CGACCTGGAGATCGGCCAGCACCGCACCAAGATCGAGGAGCTGCGCCAGCACCTGCTGCG 
CTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTGGATGGG 
CTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCATGCTGCCCGAGAAGGACAG 
CTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTA 
CGCCGGCATCAAGGTGAAGCAGCTGTGCAAGCTGCTGCGCGGCACCAAGGCCCTGACCGA 
GGTGATCCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAA 
GGAGCCCGTGCACGAGGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAA 
GCAGGGCCAGGGCCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGAC 
CGGCAAGTACGCCCGCATGCGCGGCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGC 
CGTGCAGAAGGTGAGCACCGAGAGCATCGTGATCTGGGGCAAGATCCCCAAGTTCAAGCT 
CCAGAAGGAGACCTGGGAGGCCTGGTGGATGGAGTACTGGCAGGCCACCTGGATCCCCGA

GTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCC

CATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGCTGGG 

CAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGGTGGTGAGCATCGCCGACACCAC 

CAACCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACAGCGGCCTGGAGGT 

GAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAG 

CGAGAGCGAGCTGGTGAGCCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCT 

GGCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGAG 

CGCCGGCATCCGCAAGGTGCTGTTCCTGAACGGCATCGATGGCGGCATCGTGATCTACCA 

GTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGCTTCCCGG 

GGCTAGCACCGGTGAATTC 

FIG. 72B 

(SEQ ID NO:81)






FS(-)_ProtMod_RTopt_YM 

GCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTCTTCCGCG 

AGGACCTGGCCTTCCTGCAGGGCAAGGCCCGCGAGTTCAGCAGCGAGCAGACCCGCGCCA

ACAGCCCCACCCGCCGCGAGCTGCAGGTGTGGGGCGGCGAGAACAACAGCCTGAGCGAGG 

CCGGCGCCGACCGCCAGGGCACCGTGAGCTTCAACTTCCCCCAGATCACCCTGTGGCAGC 

GCCCCCTGGTGACCATCAGGATCGGCGGCCAGCTCAAGGAGGCGCTGCTCGACACCGGCG 

CCGACGACACCGTGCTGGAGGAGATGAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCG 

GCGGGATCGGGGGCTTCATCAAGGTGCGGCAGTACGACCAGATCCCCGTGGAGATCTGCG 

GCCACAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGTGAACATCATCGGCCGCA 

ACCTGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACGGTGC 

CCGTGAAGCTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAGTGGCCCCTGACCGAGG

AGAAGATCAAGGCCCTGGTGGAGATCTGCACCGAGATGGAGAAGGAGGGCAAGATCAGCA 

AGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCA 

CCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGG 

AGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGC 

TGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACAAGGACTTCCGCAAGTACACCG

CCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGC 

TGCCCCAGGGCTGGAAGGGCAGCCCCGCCATCTTCCAGAGCAGCATGACCAAGATCCTGG 

AGCCCTTCCGCAAGCAGAACCCCGACATCGTGATCTACCAGGCCCCCCTGTACGTGGGCA 

GCGACCTGGAGATCGGCCAGCACCGCACCAAGATCGAGGAGCTGCGCCAGCACCTGCTGC 

GCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTGGATGG 

GCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCATGCTGCCCGAGAAGGACA 

GCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCT 

ACGCCGGCATCAAGGTGAAGCAGCTGTGCAAGCTGCTGCGCGGCACCAAGGCCCTGACCG 

AGGTGATCCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGA 

AGGAGCCCGTGCACGAGGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGA 

AGCAGGGCCAGGGCCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGA 

CCGGCAAGTACGCCCGCATGCGCGGCGCCCACACCAACGACGTGAAGCAGCTGACCGAGG 

CCGTGCAGAAGGTGAGCACCGAGAGCATCGTGATCTGGGGCAAGATCCCCAAGTTCAAGC 

FIG. 73A 

(SEQ ID NO:82)






TGCCCATCCAGAAGGAGACCTGGGAGGCCTGGTGGATGGAGTACTGGCAGGCCACCTGGA 

TCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGA 

AGGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCA 

AGCTGGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGGTGGTGAGCATCGCCG 

ACACCACCAACCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACAGCGGCC 

TGGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCG 

ACAAGAGCGAGAGCGAGCTGGTGAGCCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGG 

TGTACCTGGCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGC 

TGGTGAGCGCCGGCATCCGCAAGGTGCTGTTCCTGAACGGCATCGATGGCGGCATCGTGA 

TCTACCAGTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGC 

TTCCCGGGGCTAGCACCGGTGAATTC 

FIG. 73B 

(SEQ ID NO:82)




FS(-)_ProtMod_RTopt_YMWM

GCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTCTTCCGCG 

AGGACCTGGCCTTCCTGCAGGGCAAGGCCCGCGAGTTCAGCAGCGAGCAGACCCGCGCCA 

ACAGCCCCACCCGCCGCGAGCTGCAGGTGTGGGGCGGCGAGAACAACAGCCTGAGCGAGG 

CCGGCGCCGACCGCCAGGGCACCGTGAGCTTCAACTTCCCCCAGATCACCCTGTGGCAGC 

GCCCCCTGGTGACCATCAGGATCGGCGGCCAGCTCAAGGAGGCGCTGCTCGACACCGGCG 

CCGACGACACCGTGCTGGAGGAGATGAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCG 

GCGGGATCGGGGGCTTCATCAAGGTGCGGCAGTACGACCAGATCCCCGTGGAGATCTGCG 

GCCACAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGTGAACATCATCGGCCGCA 

ACCTGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACGGTGC 

CCGTGAAGCTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAGTGGCCCCTGACCGAGG 

AGAAGATCAAGGCCCTGGTGGAGATCTGCACCGAGATGGAGAAGGAGGGCAAGATCAGCA 

AGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCA 

CCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGG 

AGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGC 

TGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACAAGGACTTCCGCAAGTACACCG 

CCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGC 

TGCCCCAGGGCTGGAAGGGCAGCCCCGCCATCTTCCAGAGCAGCATGACCAAGATCCTGG

AGCCCTTCCGCAAGCAGAACCCCGACATCGTGATCTACCAGGCCCCCCTGTACGTGGGCA

GCGACCTGGAGATCGGCCAGCACCGCACCAAGATCGAGGAGCTGCGCCAGCACCTGCTGC

GCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATCG

AGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCATGCTGCCCGAGAAGGACAGCTGGA 

CCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACGCCG 

GCATCAAGGTGAAGCAGCTGTGCAAGCTGCTGCGCGGCACCAAGGCCCTGACCGAGGTGA 

TCCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAAGGAGC 

CCGTGCACGAGGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGG 

GCCAGGGCCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCA 

AGTACGCCCGCATGCGCGGCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGC 

AGAAGGTGAGCACCGAGAGCATCGTGATCTGGGGCAAGATCCCCAAGTTCAAGCTGCCCA 

FIG. 74A 

(SEQ ID NO:83)






TCCAGAAGGAGACCTGGGAGGCCTGGTGGATGGAGTACTGGCAGGCCACCTGGATCCCCG 
AGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGC 
CCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGCTGG 
GCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGGTGGTGAGCATCGCCGACACCA 
CCAACCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACAGCGGCCTGGAGG 
TGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGA 
GCGAGAGCGAGCTGGTGAGCCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACC 
TGGCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGA 
GCGCCGGCATCCGCAAGGTGCTGTTCCTGAACGGCATCGATGGCGGCATCGTGATCTACC 
AGTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGCTTCCCG 
GGGCTAGCACCGGTGAATTC 

FIG. 74B 

(SEQ ID NO:83)






FS(-)_ProtMod_RTopt(+) 

GCGGCCGCGAAGGACACCAAATGAAAGATTGCACTGAGAGACAGGCTAATTTCTTCCGCG 

AGGACCTGGCCTTCCTGCAGGGCAAGGCCCGCGAGTTCAGCAGCGAGCAGACCCGCGCCA 

ACAGCCCCACCCGCCGCGAGCTGCAGGTGTGGGGCGGCGAGAACAACAGCCTGAGCGAGG 

CCGGCGCCGACCGCCAGGGCACCGTGAGCTTCAACTTCCCCCAGATCACCCTGTGGCAGC 

GCCCCCTGGTGACCATCAGGATCGGCGGCCAGCTCAAGGAGGCGCTGCTCGACACCGGCG 

CCGACGACACCGTGCTGGAGGAGATGAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCG 

GCGGGATCGGGGGCTTCATCAAGGTGCGGCAGTACGACCAGATCCCCGTGGAGATCTGCG 

GCCACAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGTGAACATCATCGGCCGCA 

ACCTGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACGGTGC 

CCGTGAAGCTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAGTGGCCCCTGACCGAGG 

AGAAGATCAAGGCCCTGGTGGAGATCTGCACCGAGATGGAGAAGGAGGGCAAGATCAGCA 

AGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCA 

CCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGG 

AGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGC 

TGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACAAGGACTTCCGCAAGTACACCG 

CCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGC 

TGCCCCAGGGCTGGAAGGGCAGCCCCGCCATCTTCCAGAGCAGCATGACCAAGATCCTGG 

AGCCCTTCCGCAAGCAGAACCCCGACATCGTGATCTACCAGTACATGGACGACCTGTACG 

TGGGCAGCGACCTGGAGATCGGCCAGCACCGCACCAAGATCGAGGAGCTGCGCCAGCACC 

TGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGT 

GGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCATGCTGCCCGAGA 

AGGACAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCC

AGATCTACGCCGGCATCAAGGTGAAGCAGCTGTGCAAGCTGCTGCGCGGCACCAAGGCCC 

TGACCGAGGTGATCCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGA 

TCCTGAAGGAGCCCGTGCACGAGGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGA 

TCCAGAAGCAGGGCCAGGGCCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACC 

TGAAGACCGGCAAGTACGCCCGCATGCGCGGCGCCCACACCAACGACGTGAAGCAGCTGA 

CCGAGGCCGTGCAGAAGGTGAGCACCGAGAGCATCGTGATCTGGGGCAAGATCCCCAAGT 

TCAAGCTGCCCATCCAGAAGGAGACCTGGGAGGCCTGGTGGATGGAGTACTGGCAGGCCA 

CCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGC

TGGAGAAGGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCG 

FIG. 75A 

(SEQ ID NO:84)






AGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGGTGGTGAGCA 
TCGCCGACACCACCAACCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACA 
GCGGCCTGGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCC 
AGCCCGACAAGAGCGAGAGCGAGCTGGTGAGCCAGATCATCGAGCAGCTGATCAAGAAGG 
AGAAGGTGTACCTGGCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGG 
ACAAGCTGGTGAGCGCCGGCATCCGCAAGGTGCTGTTCCTGAACGGCATCGATGGCGGCA 
TCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATT 
AAAAGCTTCCCGGGGCTAGCACCGGTGAATTC 

FIG. 75B 

(SEQ ID NO:84)






Tat_wt_SF162 (wildtype) 

ATGGAGCCAGTAGATCCTAGATTAGAGCCCTGGAAGCATCCAGGAAGTCAGCCTAAGA 

CTGCTTGTACAAATTGCTATTGTAAAAAGTGTTGCTTTCATTGCCAAGTTTGT^ 

AAAAGGCTTAGGCATCTCCTATGGCAGGAAGAAGCGGAGACAGCGACGAAGAGCTCCT 

CCAGACAGTGAGGTTCATCAAGTTTCTCTACCAAAGCAACCCGCTTCCCAGCCCCAAGG 

GGACCCGACAGGCCCGAAGGAATCGAAGAAGAAGGTGGAGAGAGAGACAGAGACAGA 

TCCAGTCCATTAG 

FIG. 76 

(SEQ ID NO:85)



Tat_SF162 



MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKGLGISYGRKKRRQRRRAPPDSE 
VHQVSLPKQPASQPQGDPTGPKESKKKVERETETDPVH 

FIG. 77 

(SEQ ID NO:86)



Tat_SF162_opt 



ATGGAGCCCGTGGACCCCCGCCTGGAGCCCTGGAAGCACCCCGGCAGCCAGCCCAAGAC 

CGCCTGCACCAACTGCTACTGCAAGAAGTGCTGCTTCCACTGCCAGGTGTGCTTCATCACC 

AAGGGCCTGGGCATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCCGCGCCCCCCC 

CGACAGCGAGGTGCACCAGGTGAGCCTGCCCAAGCAGCCCGCCAGCCAGCCCCAGGGCG 

ACCCCACCGGCCCCAAGGAGAGCAAGAAGAAGGTGGAGCGCGAGACCGAGACCGACCCC 

GTGCACTAG 

FIG. 78 

(SEQ ID NO:87)
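
The relationship between the codon-modified Tat coding sequence of FIG. 78 and the Tat_SF162 polypeptide of FIG. 77 can be checked with a short translation routine. The following is a minimal sketch and not part of the original disclosure: it assumes the standard genetic code, and the names CODON_TABLE, translate, TAT_SF162_OPT and TAT_SF162_PROTEIN are hypothetical names introduced here for illustration. The two sequences are copied from FIG. 78 and FIG. 77.

```python
# Illustrative sketch only (not from the patent): translate the Tat_SF162_opt
# coding sequence (FIG. 78) and compare it with the Tat_SF162 polypeptide (FIG. 77).
# Assumption: standard genetic code, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Sequence as printed in FIG. 78, line breaks removed.
TAT_SF162_OPT = (
    "ATGGAGCCCGTGGACCCCCGCCTGGAGCCCTGGAAGCACCCCGGCAGCCAGCCCAAGAC"
    "CGCCTGCACCAACTGCTACTGCAAGAAGTGCTGCTTCCACTGCCAGGTGTGCTTCATCACC"
    "AAGGGCCTGGGCATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCCGCGCCCCCCC"
    "CGACAGCGAGGTGCACCAGGTGAGCCTGCCCAAGCAGCCCGCCAGCCAGCCCCAGGGCG"
    "ACCCCACCGGCCCCAAGGAGAGCAAGAAGAAGGTGGAGCGCGAGACCGAGACCGACCCC"
    "GTGCACTAG"
)

# Polypeptide as printed in FIG. 77, line break removed.
TAT_SF162_PROTEIN = (
    "MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKGLGISYGRKKRRQRRRAPPDSE"
    "VHQVSLPKQPASQPQGDPTGPKESKKKVERETETDPVH"
)


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    residues = []
    for start in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[start:start + 3]]
        if residue == "*":  # stop codon ends the open reading frame
            break
        residues.append(residue)
    return "".join(residues)


if __name__ == "__main__":
    # Reports whether the translated FIG. 78 sequence matches FIG. 77.
    print(translate(TAT_SF162_OPT) == TAT_SF162_PROTEIN)
```

The same routine can be applied to the Tat_Cys22_SF162_opt sequence of FIG. 79, where the lowercase g in the printed sequence changes codon 22 from TGC (Cys) to GGC (Gly).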



Tat_Cys22_SF162_opt 



ATGGAGCCCGTGGACCCCCGCCTGGAGCCCTGGAAGCACCCCGGCAGCCAGCCCAAGAC 

CGCCgGCACCAACTGCTACTGCAAGAAGTGCTGCTTCCACTGCCAGGTGTGCTTCATCACCA 

AGGGCCTGGGCATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCCGCGCCCCCCCC 

GACAGCGAGGTGCACCAGGTGAGCCTGCCCAAGCAGCCCGCCAGCCAGCCCCAGGGCGA 

CCCCACCGGCCCCAAGGAGAGCAAGAAGAAGGTGGAGCGCGAGACCGAGACCGACCCCG 

TGCACTAG 

FIG. 79 

(SEQ ID NO:88)



SUBSTITUTE SHEET (RULE 26) 






[Figure sheet: rotated nucleotide sequence alignment. The sequence rows and row labels are not recoverable from the scan.]



WO 00/39302 



PCT/US99/31245 



127 / 131 



[Drawing sheet 127/131: sequence alignment figure; not legible in this copy.]

[Drawing sheet 128/131: sequence alignment figure; not legible in this copy.]

[Drawing sheet 129/131: sequence alignment figure; not legible in this copy.]

[Drawing sheet 130/131: sequence alignment figure; not legible in this copy.]

[Drawing sheet 131/131: sequence figures; not legible in this copy.]


SEQUENCE LISTING 



<110> Chiron Corporation 

<120> IMPROVED EXPRESSION OF HIV POLYPEPTIDES AND PRODUCTION 
OF VIRUS-LIKE PARTICLES 

<130> 1621.100 

<140> 
<141> 

<160> 90 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 1509 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 1 

atgggtgcga gagcgtcggt attaagcggg ggagaattag ataaatggga aaaaattcgg 60 
ttaaggccag ggggaaagaa aaaatataag ttaaaacata tagtatgggc aagcagggag 120 
ctagaacgat tcgcagtcaa tcctggcctg ttagaaacat cagaaggctg cagacaaata 180 
ttgggacagc tacagccatc ccttcagaca ggatcagaag aacttagatc attatataat 240 
acagtagcaa ccctctattg tgtacatcaa aggatagatg taaaagacac caaggaagct 300 
ttagagaaga tagaggaaga gcaaaacaaa agtaagaaaa aggcacagca agcagcagct 360 
gcagctggca caggaaacag cagccaggtc agccaaaatt accctatagt gcagaaccta 420
caggggcaaa tggtacatca ggccatatca cctagaactt taaatgcatg ggtaaaagta 480 
gtagaagaaa aggctttcag cccagaagta atacccatgt tttcagcatt atcagaagga 540 
gccaccccac aagatttaaa caccatgcta aacacagtgg ggggacatca agcagccatg 600 
caaatgttaa aagagactat caatgaggaa gctgcagaat gggatagagt gcatccagtg 660 
catgcagggc ctattgcacc aggccaaatg agagaaccaa ggggaagtga catagcagga 720 
actactagta cccttcagga acaaatagga tggatgacaa ataatccacc tatcccagta 780 
ggagaaatct ataaaagatg gataatcctg ggattaaata aaatagtaag aatgtatagc 840 
cctaccagca ttctggacat aagacaagga ccaaaggaac cctttagaga ttatgtagac 900 
cggttctata aaactctaag agccgaacaa gcttcacagg atgtaaaaaa ttggatgaca 960 
gaaaccttgt tggtccaaaa tgcaaaccca gattgtaaga ctattttaaa agcattggga 1020 
ccagcagcta cactagaaga aatgatgaca gcatgtcagg gagtgggggg acccggccat 1080 
aaagcaagag ttttggctga agccatgagc caagtaacaa atccagctaa cataatgatg 1140 
cagagaggca attttaggaa ccaaagaaag actgttaagt gtttcaattg tggcaaagaa 1200 
gggcacatag ccaaaaattg cagggcccct aggaaaaagg gctgttggag atgtggaagg 1260 
gaaggacacc aaatgaaaga ttgcactgag agacaggcta attttttagg gaagatctgg 1320 
ccttcctaca agggaaggcc agggaatttt cttcagagca gaccagagcc aacagcccca 1380 
ccagaagaga gcttcaggtt tggggaggag aaaacaactc cctctcagaa gcaggagccg 1440 
atagacaagg aactgtatcc tttaacttcc ctcagatcac tctttggcaa cgacccctcg 1500 
tcacaataa 1509 

<210> 2 
<211> 1845 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 2 

atgggtgcga gagcgtcggt attaagcggg ggagaattag ataaatggga aaaaattcgg 60 
ttaaggccag ggggaaagaa aaaatataag ttaaaacata tagtatgggc aagcagggag 120 
ctagaacgat tcgcagtcaa tcctggcctg ttagaaacat cagaaggctg cagacaaata 180 
ttgggacagc tacagccatc ccttcagaca ggatcagaag aacttagatc attatataat 240

acagtagcaa ccctctattg tgtacatcaa aggatagatg taaaagacac caaggaagct 300 

ttagagaaga tagaggaaga gcaaaacaaa agtaagaaaa aggcacagca agcagcagct 360 

gcagctggca caggaaacag cagccaggtc agccaaaatt accctatagt gcagaaccta 420 

caggggcaaa tggtacatca ggccatatca cctagaactt taaatgcatg ggtaaaagta 480 

gtagaagaaa aggctttcag cccagaagta atacccatgt tttcagcatt atcagaagga 540 

gccaccccac aagatttaaa caccatgcta aacacagtgg ggggacatca agcagccatg 600 

caaatgttaa aagagactat caatgaggaa gctgcagaat gggatagagt gcatccagtg 660 

catgcagggc ctattgcacc aggccaaatg agagaaccaa ggggaagtga catagcagga 720 

actactagta cccttcagga acaaatagga tggatgacaa ataatccacc tatcccagta 780 

ggagaaatct ataaaagatg gataatcctg ggattaaata aaatagtaag aatgtatagc 840 

cctaccagca ttctggacat aagacaagga ccaaaggaac cctttagaga ttatgtagac 900 

cggttctata aaactctaag agccgaacaa gcttcacagg atgtaaaaaa ttggatgaca 960 

gaaaccttgt tggtccaaaa tgcaaaccca gattgtaaga ctattttaaa agcattggga 1020 

ccagcagcta cactagaaga aatgatgaca gcatgtcagg gagtgggggg acccggccat 1080 

aaagcaagag ttttggctga agccatgagc caagtaacaa atccagctaa cataatgatg 1140

cagagaggca attttaggaa ccaaagaaag actgttaagt gtttcaattg tggcaaagaa 1200 

gggcacatag ccaaaaattg cagggcccct aggaaaaagg gctgttggag atgtggaagg 1260 

gaaggacacc aaatgaaaga ttgcactgag agacaggcta attttttagg gaagatctgg 1320 

ccttcctaca agggaaggcc agggaatttt cttcagagca gaccagagcc aacagcccca 1380 

ccagaagaga gcttcaggtt tggggaggag aaaacaactc cctctcagaa gcaggagccg 1440

atagacaagg aactgtatcc tttaacttcc ctcagatcac tctttggcaa cgacccctcg 1500 

tcacaataag gatagggggg caactaaagg aagctctatt agatacagga gcagatgata 1560 

cagtattaga agaaatgaat ttgccaggaa aatggaaacc aaaaatgata gggggaattg 1620 

gaggttttat caaagtaaga cagtacgatc agatacctgt agaaatctgt ggacataaag 1680 

ctataggtac agtattagta ggacctacac ctgtcaacat aattggaaga aatctgttga 1740 

ctcagattgg ttgtacttta aatttcccca ttagtcctat tgaaactgta ccagtaaaat 1800 

taaagccagg aatggatggc ccaaaagtta agcaatggcc attga 1845 

<210> 3 
<211> 4313 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 3 

atgggtgcga gagcgtcggt attaagcggg ggagaattag ataaatggga aaaaattcgg 60 

ttaaggccag ggggaaagaa aaaatataag ttaaaacata tagtatgggc aagcagggag 120 

ctagaacgat tcgcagtcaa tcctggcctg ttagaaacat cagaaggctg cagacaaata 180 

ttgggacagc tacagccatc ccttcagaca ggatcagaag aacttagatc attatataat 240 

acagtagcaa ccctctattg tgtacatcaa aggatagatg taaaagacac caaggaagct 300 

ttagagaaga tagaggaaga gcaaaacaaa agtaagaaaa aggcacagca agcagcagct 360

gcagctggca caggaaacag cagccaggtc agccaaaatt accctatagt gcagaaccta 420 

caggggcaaa tggtacatca ggccatatca cctagaactt taaatgcatg ggtaaaagta 480

gtagaagaaa aggctttcag cccagaagta atacccatgt tttcagcatt atcagaagga 540 

gccaccccac aagatttaaa caccatgcta aacacagtgg ggggacatca agcagccatg 600 

caaatgttaa aagagactat caatgaggaa gctgcagaat gggatagagt gcatccagtg 660 

catgcagggc ctattgcacc aggccaaatg agagaaccaa ggggaagtga catagcagga 720 

actactagta cccttcagga acaaatagga tggatgacaa ataatccacc tatcccagta 780 

ggagaaatct ataaaagatg gataatcctg ggattaaata aaatagtaag aatgtatagc 840

cctaccagca ttctggacat aagacaagga ccaaaggaac cctttagaga ttatgtagac 900 

cggttctata aaactctaag agccgaacaa gcttcacagg atgtaaaaaa ttggatgaca 960 

gaaaccttgt tggtccaaaa tgcaaaccca gattgtaaga ctattttaaa agcattggga 1020 

ccagcagcta cactagaaga aatgatgaca gcatgtcagg gagtgggggg acccggccat 1080 

aaagcaagag ttttggctga agccatgagc caagtaacaa atccagctaa cataatgatg 1140 

cagagaggca attttaggaa ccaaagaaag actgttaagt gtttcaattg tggcaaagaa 1200 

gggcacatag ccaaaaattg cagggcccct aggaaaaagg gctgttggag atgtggaagg 1260 

gaaggacacc aaatgaaaga ttgcactgag agacaggcta attttttagg gaagatctgg 1320 

ccttcctaca agggaaggcc agggaatttt cttcagagca gaccagagcc aacagcccca 1380 

ccagaagaga gcttcaggtt tggggaggag aaaacaactc cctctcagaa gcaggagccg 1440
atagacaagg aactgtatcc tttaacttcc ctcagatcac tctttggcaa cgacccctcg 1500
tcacaataag gatagggggg caactaaagg aagctctatc agatacagga gcagatgata 1560
cagtattaga agaaatgaat ttgccaggaa aatggaaacc aaaaatgata gggggaattg 1620
gaggttttat caaagtaaga cagtacgatc agatacctgt agaaatctgt ggacataaag 1680
ctataggtac agtattagta ggacctacac ctgtcaacat aattggaaga aatctgttga 1740
ctcagattgg ttgtacttta aatttcccca ttagtcctat tgaaactgta ccagtaaaat 1800
taaagccagg aatggatggc ccaaaagtta agcaatggcc attgacagaa gaaaaaataa 1860
aagcattagt agagatatgt acagaaatgg aaaaggaagg gaaaatttca aaaattgggc 1920
ctgaaaatcc atacaatact ccagtatttg ctataaagaa aaaagacagt actaaatgga 1980
gaaaactagt agatttcaga gaacttaata aaagaactca agacttctgg gaagttcagt 2040
taggaatacc acaccccgca gggttaaaaa agaaaaaatc agtaacagta ttggatgtgg 2100
gtgatgcata cttttcagtt cccttagata aagactttag aaagtatact gcatttacca 2160
tacctagtat aaacaatgag acaccaggga ttagatatca gtacaatgtg ctgccacagg 2220
gatggaaagg atcaccagca atattccaaa gtagcatgac aaaaatctta gagcctttta 2280
gaaaacagaa tccagacata gttatctatc aatacatgga tgatttgtat gtaggatctg 2340
acttagaaat agggcagcat agaacaaaaa tagaggaact gagacagcat ctgttgaggt 2400
ggggatttac cacaccagac aaaaaacatc agaaagaacc tccattcctt tggatgggtt 2460
atgaactcca tcctgataaa tggacagtac agcctataat gctgccagaa aaagacagct 2520
ggactgtcaa tgacatacag aagttagtgg gaaaattgaa ttgggcaagt cagatttatg 2580
cagggattaa agtaaagcag ttatgtaaac tccttagagg aaccaaagca ctaacagaag 2640
taataccact aacagaagaa gcagagctag aactggcaga aaacagggag attctaaaag 2700
aaccagtaca tgaagtatat tatgacccat caaaagactt agtagcagaa atacagaagc 2760
aggggcaagg ccaatggaca tatcaaattt atcaagagcc atttaaaaat ctgaaaacag 2820
gaaagtatgc aaggatgagg ggtgcccaca ctaatgatgt aaaacagtta acagaggcag 2880
tgcaaaaagt atccacagaa agcatagtaa tatggggaaa gattcctaaa tttaaactac 2940
ccatacaaaa ggaaacatgg gaagcatggt ggatggagta ttggcaagct acctggattc 3000
ctgagtggga gtttgtcaat acccctccct tagtgaaatt atggtaccag ttagagaaag 3060
aacccatagt aggagcagaa actttctatg tagatggggc agctaatagg gagactaaat 3120
taggaaaagc aggatatgtt actgacagag gaagacaaaa agttgtctcc atagctgaca 3180
caacaaatca gaagactgaa ttacaagcaa ttcatctagc tttgcaggat tcgggattag 3240
aagtaaacat agtaacagac tcacaatatg cattaggaat cattcaagca caaccagata 3300
agagtgaatc agagttagtc agtcaaataa tagagcagtt aataaaaaag gaaaaggtct 3360
acctggcatg ggtaccagca cacaaaggaa ttggaggaaa tgaacaagta gataaattag 3420
tcagtgctgg aatcaggaaa gtactatttt tgaatggaat agataaggcc caagaagaac 3480
atgagaaata tcacagtaat tggagagcaa tggctagtga ttttaacctg ccacctgtag 3540
tagcaaaaga aatagtagcc agctgtgata aatgtcagct aaaaggagaa gccatgcatg 3600
gacaagtaga ctgtagtcca ggaatatggc aactagattg tacacatcta gaaggaaaaa 3660
ttatcctggt agcagttcat gtagccagtg gatatataga agcagaagtt attccagcag 3720
agacagggca ggaaacagca tattttctct taaaattagc aggaagatgg ccagtaaaaa 3780
caatacatac agacaatggc agcaatttca ccagtactac ggttaaggcc gcctgttggt 3840
gggcagggat caagcaggaa tttggcattc cctacaatcc ccaaagtcaa ggagtagtag 3900
aatctatgaa taatgaatta aagaaaatta taggacaggt aagagatcag gctgaacacc 3960
ttaagacagc agtacaaatg gcagtattca tccacaattt taaaagaaaa ggggggattg 4020
ggggatacag tgcaggggaa agaatagtag acataatagc aacagacata caaactaaag 4080
aactacaaaa gcaaattaca aaaattcaaa attttcgggt ttattacagg gacaacaaag 4140
atcccctttg gaaaggacca gcaaagcttc tctggaaagg tgaaggggca gtagtaatac 4200
aagataatag tgacataaaa gtagtgccaa gaagaaaagc aaaaatcatt agggattatg 4260
gaaaacagat ggcaggtgat gattgtgtgg caagtagaca ggatgaggat tag 4313



<210> 4 
<211> 1515 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
HIV-Gag 



<400> 4 
gccaccatgg gcgcccgcgc cagcgtgctg agcggcggcg agctggacaa gtgggagaag 60
atccgcctgc gccccggcgg caagaagaag tacaagctga agcacatcgt gtgggccagc 120
cgcgagctgg agcgcttcgc cgtgaacccc ggcctgctgg agaccagcga gggctgccgc 180 
cagatcctgg gccagctgca gcccagcctg cagaccggca gcgaggagct gcgcagcctg 240
tacaacaccg tggccaccct gtactgcgtg caccagcgca tcgacgtcaa ggacaccaag 300 
gaggccctgg agaagatcga ggaggagcag aacaagtcca agaagaaggc ccagcaggcc 360 
gccgccgccg ccggcaccgg caacagcagc caggtgagcc agaactaccc catcgtgcag 420 
aacctgcagg gccagatggt gcaccaggcc atcagccccc gcaccctgaa cgcctgggtg 480
aaggtggtgg aggagaaggc cttcagcccc gaggtgatcc ccatgttcag cgccctgagc 540 
gagggcgcca ccccccagga cctgaacacg atgttgaaca ccgtgggcgg ccaccaggcc 600 
gccatgcaga tgctgaagga gaccatcaac gaggaggccg ccgagtggga ccgcgtgcac 660 
cccgtgcacg ccggccccat cgcccccggc cagatgcgcg agccccgcgg cagcgacatc 720 
gccggcacca ccagcaccct gcaggagcag atcggctgga tgaccaacaa cccccccatc 780 
cccgtgggcg agatctacaa gcggtggatc atcctgggcc tgaacaagat cgtgcggatg 840
tacagcccca ccagcatcct ggacatccgc cagggcccca aggagccctt ccgcgactac 900 
gtggaccgct tctacaagac cctgcgcgct gagcaggcca gccaggacgt gaagaactgg 960 
atgaccgaga ccctgctggt gcagaacgcc aaccccgact gcaagaccat cctgaaggct 1020 
ctcggccccg cggccaccct ggaggagatg atgaccgcct gccagggcgt gggcggcccc 1080 
ggccacaagg cccgcgtgct ggccgaggcg atgagccagg tgacgaaccc ggcgaccatc 1140
atgatgcagc gcggcaactt ccgcaaccag cggaagaccg tcaagtgctt caactgcggc 1200 
aaggagggcc acaccgccag gaactgccgc gccccccgca agaagggctg ctggcgctgc 1260 
ggccgcgagg gccaccagat gaaggactgc accgagcgcc aggccaactt cctgggcaag 1320 
atctggccca gctacaaggg ccgccccggc aacttcctgc agagccgccc cgagcccacc 1380 
gccccccccg aggagagctt ccgcttcggc gaggagaaga ccacccccag ccagaagcag 1440 
gagcccatcg acaaggagct gtaccccctg accagcctgc gcagcctgtt cggcaacgac 1500 
cccagcagcc agtaa 1515 

<210> 5 
<211> 1853 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
HIV-Gag-protease

<400> 5 

gccaccatgg gcgcccgcgc cagcgtgctg agcggcggcg agctggacaa gtgggagaag 60 
atccgcctgc gccccggcgg caagaagaag tacaagctga agcacatcgt gtgggccagc 120 
cgcgagctgg agcgcttcgc cgtgaacccc ggcctgctgg agaccagcga gggctgccgc 180 
cagatcctgg gccagctgca gcccagcctg cagaccggca gcgaggagct gcgcagcctg 240 
tacaacaccg tggccaccct gtactgcgtg caccagcgca tcgacgtcaa ggacaccaag 300
gaggccctgg agaagatcga ggaggagcag aacaagtcca agaagaaggc ccagcaggcc 360 
gccgccgccg ccggcaccgg caacagcagc caggtgagcc agaactaccc catcgtgcag 420 
aacctgcagg gccagatggt gcaccaggcc atcagccccc gcaccctgaa cgcctgggtg 480 
aaggtggtgg aggagaaggc cttcagcccc gaggtgatcc ccatgttcag cgccctgagc 540
gagggcgcca ccccccagga cctgaacacg atgttgaaca ccgtgggcgg ccaccaggcc 600 
gccatgcaga tgctgaagga gaccatcaac gaggaggccg ccgagtggga ccgcgtgcac 660 
cccgtgcacg ccggccccat cgcccccggc cagatgcgcg agccccgcgg cagcgacatc 720
gccggcacca ccagcaccct gcaggagcag atcggctgga tgaccaacaa cccccccatc 780
cccgtgggcg agatctacaa gcggtggatc atcctgggcc tgaacaagat cgtgcggatg 840
tacagcccca ccagcatcct ggacatccgc cagggcccca aggagccctt ccgcgactac 900 
gtggaccgct tctacaagac cctgcgcgct gagcaggcca gccaggacgt gaagaactgg 960 
atgaccgaga ccctgctggt gcagaacgcc aaccccgact gcaagaccat cctgaaggct 1020 
ctcggccccg cggccaccct ggaggagatg atgaccgcct gccagggcgt gggcggcccc 1080 
ggccacaagg cccgcgtgct ggccgaggcg atgagccagg tgacgaaccc ggcgaccatc 1140
atgatgcagc gcggcaactt ccgcaaccag cggaagaccg tcaagtgctt caactgcggc 1200 
aaggagggcc acaccgccag gaactgccgc gccccccgca agaagggctg ctggcgctgc 1260 
ggccgcgaag gacaccaaat gaaagattgc actgagagac aggctaattt tttagggaag 1320 
atctggcctt cctacaaggg aaggccaggg aattttcttc agagcagacc agagccaaca 1380

gccccaccag aagagagctt caggtttggg gaggagaaaa caactccctc tcagaagcag 1440 

gagccgatag acaaggaact gtatccttta acttccctca gatcactctt tggcaacgac 1500 

ccctcgtcac agtaaggatc ggcggccagc tcaaggaggc gctgctcgac accggcgccg 1560 

acgacaccgt gctggaggag atgaacctgc ccggcaagtg gaagcccaag atgatcggcg 1620 

ggatcggggg cttcatcaag gtgcggcagt acgaccagat ccccgtggag atctgcggcc 1680 

acaaggccat cggcaccgtg ctggtgggcc ccacccccgt gaacatcatc ggccgcaacc 1740 

tgctgaccca gatcggctgc accctgaact tccccatcag ccccatcgag acggtgcccg 1800 

tgaagctgaa gccggggatg gacggcccca aggtcaagca gtggcccctg taa 1853 

<210> 6 
<211> 4319 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
HIV-Gag-polymerase

<400> 6 

gccaccatgg gcgcccgcgc cagcgtgctg agcggcggcg agctggacaa gtgggagaag 60 
atccgcctgc gccccggcgg caagaagaag tacaagctga agcacatcgt gtgggccagc 120 
cgcgagctgg agcgcttcgc cgtgaacccc ggcctgctgg agaccagcga gggctgccgc 180 
cagatcctgg gccagctgca gcccagcctg cagaccggca gcgaggagct gcgcagcctg 240 
tacaacaccg tggccaccct gtactgcgtg caccagcgca tcgacgtcaa ggacaccaag 300 
gaggccctgg agaagatcga ggaggagcag aacaagtcca agaagaaggc ccagcaggcc 360 
gccgccgccg ccggcaccgg caacagcagc caggtgagcc agaactaccc catcgtgcag 420 
aacctgcagg gccagatggt gcaccaggcc atcagccccc gcaccctgaa cgcctgggtg 480
aaggtggtgg aggagaaggc cttcagcccc gaggtgatcc ccatgttcag cgccctgagc 540 
gagggcgcca ccccccagga cctgaacacg atgttgaaca ccgtgggcgg ccaccaggcc 600 
gccatgcaga tgctgaagga gaccatcaac gaggaggccg ccgagtggga ccgcgtgcac 660 
cccgtgcacg ccggccccat cgcccccggc cagatgcgcg agccccgcgg cagcgacatc 720
gccggcacca ccagcaccct gcaggagcag atcggctgga tgaccaacaa cccccccatc 780 
cccgtgggcg agatctacaa gcggtggatc atcctgggcc tgaacaagat cgtgcggatg 840
tacagcccca ccagcatcct ggacatccgc cagggcccca aggagccctt ccgcgactac 900 
gtggaccgct tctacaagac cctgcgcgct gagcaggcca gccaggacgt gaagaactgg 960 
atgaccgaga ccctgctggt gcagaacgcc aaccccgact gcaagaccat cctgaaggct 1020 
ctcggccccg cggccaccct ggaggagatg atgaccgcct gccagggcgt gggcggcccc 1080 
ggccacaagg cccgcgtgct ggccgaggcg atgagccagg tgacgaaccc ggcgaccatc 1140
atgatgcagc gcggcaactt ccgcaaccag cggaagaccg tcaagtgctt caactgcggc 1200 
aaggagggcc acaccgccag gaactgccgc gccccccgca agaagggctg ctggcgctgc 1260 
ggccgcgaag gacaccaaat gaaagattgc actgagagac aggctaattt tttagggaag 1320 
atctggcctt cctacaaggg aaggccaggg aattttcttc agagcagacc agagccaaca 1380 
gccccaccag aagagagctt caggtttggg gaggagaaaa caactccctc tcagaagcag 1440
gagccgatag acaaggaact gtatccttta acttccctca gatcactctt tggcaacgac 1500 
ccctcgtcac agtaaggatc ggcggccagc tcaaggaggc gctgctcgac accggcgccg 1560 
acgacaccgt gctggaggag atgaacctgc ccggcaagtg gaagcccaag atgatcggcg 1620 
ggatcggggg cttcatcaag gtgcggcagt acgaccagat ccccgtggag atctgcggcc 1680 
acaaggccat cggcaccgtg ctggtgggcc ccacccccgt gaacatcatc ggccgcaacc 1740
tgctgaccca gatcggctgc accctgaact tccccatcag ccccatcgag acggtgcccg 1800 
tgaagctgaa gccggggatg gacggcccca aggtcaagca gtggcccctg accgaggaga 1860 
agatcaaggc cctggtggag atctgcaccg agatggagaa ggagggcaag atcagcaaga 1920 
tcggccccga gaacccctac aacacccccg tgttcgccat caagaagaag gacagcacca 1980 
agtggcgcaa gctggtggac ttccgcgagc tgaacaagcg cacccaggac ttctgggagg 2040
tgcagctggg catcccccac cccgccggcc tgaagaagaa gaagagcgtg accgtgctgg 2100 
acgtgggcga cgcctacttc agcgtgcccc tggacaagga cttccgcaag tacaccgcct 2160 
tcaccatccc cagcatcaac aacgagaccc ccggcatccg ctaccagtac aacgtgctgc 2220 
cccagggctg gaagggcagc cccgccatct tccagagcag catgaccaag atcctggagc 2280 
ccttccgcaa gcagaacccc gacatcgtga tctaccagta catggacgac ctgtacgtgg 2340
gcagcgacct ggagatcggc cagcaccgca ccaagatcga ggagctgcgc cagcacctgc 2400
tgcgctgggg cttcaccacc cccgacaaga agcaccagaa ggagcccccc ttcctgtgga 2460
tgggctacga gctgcacccc gacaagtgga ccgtgcagcc catcatgctg cccgagaagg 2520
acagctggac cgtgaacgac atccagaagc tggtgggcaa gctgaactgg gccagccaga 2580
tctacgccgg catcaaggtg aagcagctgt gcaagctgct gcgcggcacc aaggccctga 2640
ccgaggtgat ccccctgacc gaggaggccg agctggagct ggccgagaac cgcgagatcc 2700
tgaaggagcc cgtgcacgag gtgtactacg accccagcaa ggacctggtg gccgagatcc 2760
agaagcaggg ccagggccag tggacctacc agatctacca ggagcccttc aagaacctga 2820
agaccggcaa gtacgcccgc atgcgcggcg cccacaccaa cgacgtgaag cagctgaccg 2880
aggccgtgca gaaggtgagc accgagagca tcgtgatctg gggcaagatc cccaagttca 2940
agctgcccat ccagaaggag acctgggagg cctggtggat ggagtactgg caggccacct 3000
ggatccccga gtgggagttc gtgaacaccc cccccctggt gaagctgtgg taccagctgg 3060
agaaggagcc catcgtgggc gccgagacct tctacgtgga cggcgccgcc aaccgcgaga 3120
ccaagctggg caaggccggc tacgtgaccg accgcggccg ccagaaggtg gtgagcatcg 3180
ccgacaccac caaccagaag accgagctgc aggccatcca cctggccctg caggacagcg 3240
gcctggaggt gaacatcgtg accgacagcc agtacgccct gggcatcatc caggcccagc 3300
ccgacaagag cgagagcgag ctggtgagcc agatcatcga gcagctgatc aagaaggaga 3360
aggtgtacct ggcctgggtg cccgcccaca agggcatcgg cggcaacgag caggtggaca 3420
agctggtgag cgccggcatc cgcaaggtgc tgttcctgaa cggcatcgac aaggcccagg 3480
aggagcacga gaagtaccac agcaactggc gcgccatggc cagcgacttc aacctgcccc 3540
ccgtggtggc caaggagatc gtggccagct gcgacaagtg ccagctgaag ggcgaggcca 3600
tgcacggcca ggtggactgc agccccggca tctggcagct ggactgcacc cacctggagg 3660
gcaagatcat cctggtggcc gtgcacgtgg ccagcggcta catcgaggcc gaggtgatcc 3720
ccgccgagac cggccaggag accgcctact tcctgctgaa gctggccggc cgctggcccg 3780
tgaagaccat ccacaccgac aacggcagca acttcaccag caccaccgtg aaggccgcct 3840
gctggtgggc cggcatcaag caggagttcg gcatccccta caacccccag agccagggcg 3900
tggtggagag catgaacaac gagctgaaga agatcatcgg ccaggtgcgc gaccaggccg 3960
agcacctgaa gaccgccgtg cagatggccg tgttcatcca caacttcaag cgcaagggcg 4020
gcatcggcgg ctacagcgcc ggcgagcgca tcgtggacat catcgccacc gacatccaga 4080
ccaaggagct gcagaagcag atcaccaaga tccagaactt ccgcgtgtac taccgcgaca 4140
acaaggaccc cctgtggaag ggccccgcca agctgctgtg gaagggcgag ggcgccgtgg 4200
tgatccagga caacagcgac atcaaggtgg tgccccgccg caaggccaag atcatccgcg 4260
actacggcaa gcagatggcc ggcgacgact gcgtggccag ccgccaggac gaggactag 4319



<210> 7 

<211> 2031 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
HIV-Gag/HCV-core fusion polypeptide 



<400> 7 

gccaccatgg gcgcccgcgc cagcgtgctg agcggcggcg agctggacaa gtgggagaag 60
atccgcctgc gccccggcgg caagaagaag tacaagctga agcacatcgt gtgggccagc 120
cgcgagctgg agcgcttcgc cgtgaacccc ggcctgctgg agaccagcga gggctgccgc 180
cagatcctgg gccagctgca gcccagcctg cagaccggca gcgaggagct gcgcagcctg 240
tacaacaccg tggccaccct gtactgcgtg caccagcgca tcgacgtcaa ggacaccaag 300
gaggccctgg agaagatcga ggaggagcag aacaagtcca agaagaaggc ccagcaggcc 360
gccgccgccg ccggcaccgg caacagcagc caggtgagcc agaactaccc catcgtgcag 420
aacctgcagg gccagatggt gcaccaggcc atcagccccc gcaccctgaa cgcctgggtg 480
aaggtggtgg aggagaaggc cttcagcccc gaggtgatcc ccatgttcag cgccctgagc 540
gagggcgcca ccccccagga cctgaacacg atgttgaaca ccgtgggcgg ccaccaggcc 600
gccatgcaga tgctgaagga gaccatcaac gaggaggccg ccgagtggga ccgcgtgcac 660
cccgtgcacg ccggccccat cgcccccggc cagatgcgcg agccccgcgg cagcgacatc 720
gccggcacca ccagcaccct gcaggagcag atcggctgga tgaccaacaa cccccccatc 780
cccgtgggcg agatctacaa gcggtggatc atcctgggcc tgaacaagat cgtgcggatg 840
tacagcccca ccagcatcct ggacatccgc cagggcccca aggagccctt ccgcgactac 900
gtggaccgct tctacaagac cctgcgcgct gagcaggcca gccaggacgt gaagaactgg 960
atgaccgaga ccctgctggt gcagaacgcc aaccccgact gcaagaccat cctgaaggct 1020 
ctcggccccg cggccaccct ggaggagatg atgaccgcct gccagggcgt gggcggcccc 1080 
ggccacaagg cccgcgtgct ggccgaggcg atgagccagg tgacgaaccc ggcgaccatc 1140
atgatgcagc gcggcaactt ccgcaaccag cggaagaccg tcaagtgctt caactgcggc 1200 
aaggagggcc acaccgccag gaactgccgc gccccccgca agaagggctg ctggcgctgc 1260 
ggccgcgagg gccaccagat gaaggactgc accgagcgcc aggccaactt cctgggcaag 1320 
atctggccca gctacaaggg ccgccccggc aacttcctgc agagccgccc cgagcccacc 1380 
gccccccccg aggagagctt ccgcttcggc gaggagaaga ccacccccag ccagaagcag 1440 
gagcccatcg acaaggagct gtaccccctg accagcctgc gcagcctgtt cggcaacgac 1500 
cccagcagcc agtcgacgaa tcctaaacct caaagaaaaa acaaacgtaa caccaaccgt 1560 
cgcccacagg acgtcaagtt cccgggtggc ggtcagatcg ttggtggagt ttacttgttg 1620 
ccgcgcaggg gccctagatt gggtgtgcgc gcgacgagaa agacttccga gcggtcgcaa 1680 
cctcgaggta gacgtcagcc tatccccaag gctcgtcggc ccgagggcag gacctgggct 1740
cagcccgggt acccttggcc cctctatggc aatgagggct gcgggtgggc gggatggctc 1800 
ctgtctcccc gtggctctcg gcctagctgg ggccccacag acccccggcg taggtcgcgc 1860
aatttgggta aggtcatcga tacccttacg tgcggcttcg ccgacctcat ggggtacata 1920 
ccgctcgtcg gcgcccctct tggaggcgct gccagggccc tggcgcatgg cgtccgggtt 1980 
ctggaagacg gcgtgaacta tgcaacaggg aaccttcctg gttgctctta g 2031 

<210> 8 
<211> 2025 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
HIV-Gag/HCV-core fusion polypeptide

<400> 8 

atgggtgcga gagcgtcggt attaagcggg ggagaattag ataaatggga aaaaattcgg 60 
ttaaggccag ggggaaagaa aaaatataag ttaaaacata tagtatgggc aagcagggag 120 
ctagaacgat tcgcagtcaa tcctggcctg ttagaaacat cagaaggctg cagacaaata 180 
ttgggacagc tacagccatc ccttcagaca ggatcagaag aacttagatc attatataat 240 
acagtagcaa ccctctattg tgtacatcaa aggatagatg taaaagacac caaggaagct 300 
ttagagaaga tagaggaaga gcaaaacaaa agtaagaaaa aggcacagca agcagcagct 360 
gcagctggca caggaaacag cagccaggtc agccaaaatt accctatagt gcagaaccta 420 
caggggcaaa tggtacatca ggccatatca cctagaactt taaatgcatg ggtaaaagta 480 
gtagaagaaa aggctttcag cccagaagta atacccatgt tttcagcatt atcagaagga 540 
gccaccccac aagatttaaa caccatgcta aacacagtgg ggggacatca agcagccatg 600 
caaatgttaa aagagactat caatgaggaa gctgcagaat gggatagagt gcatccagtg 660 
catgcagggc ctattgcacc aggccaaatg agagaaccaa ggggaagtga catagcagga 720 
actactagta cccttcagga acaaatagga tggatgacaa ataatccacc tatcccagta 780 
ggagaaatct ataaaagatg gataatcctg ggattaaata aaatagtaag aatgtatagc 840 
cctaccagca ttctggacat aagacaagga ccaaaggaac cctttagaga ttatgtagac 900 
cggttctata aaactctaag agccgaacaa gcttcacagg atgtaaaaaa ttggatgaca 960 
gaaaccttgt tggtccaaaa tgcaaaccca gattgtaaga ctattttaaa agcattggga 1020 
ccagcagcta cactagaaga aatgatgaca gcatgtcagg gagtgggggg acccggccat 1080 
aaagcaagag ttttggctga agccatgagc caagtaacaa atccagctaa cataatgatg 1140
cagagaggca attttaggaa ccaaagaaag actgttaagt gtttcaattg tggcaaagaa 1200 
gggcacatag ccaaaaattg cagggcccct aggaaaaagg gctgttggag atgtggaagg 1260 
gaaggacacc aaatgaaaga ttgcactgag agacaggcta attttttagg gaagatctgg 1320 
ccttcctaca agggaaggcc agggaatttt cttcagagca gaccagagcc aacagcccca 1380 
ccagaagaga gcttcaggtt tggggaggag aaaacaactc cctctcagaa gcaggagccg 1440 
atagacaagg aactgtatcc tttaacttcc ctcagatcac tctttggcaa cgacccctcg 1500 
tcacagtcga cgaatcctaa acctcaaaga aaaaacaaac gtaacaccaa ccgtcgccca 1560 
caggacgtca agttcccggg tggcggtcag atcgttggtg gagtttactt gttgccgcgc 1620 
aggggcccta gattgggtgt gcgcgcgacg agaaagactt ccgagcggtc gcaacctcga 1680 
ggtagacgtc agcctatccc caaggctcgt cggcccgagg gcaggacctg ggctcagccc 1740 




gggtaccctt ggcccctcta tggcaatgag ggctgcgggt gggcgggatg gctcctgtct 1800 

ccccgtggct ctcggcctag ctggggcccc acagaccccc ggcgtaggtc gcgcaatttg 1860 

ggtaaggtca tcgataccct tacgtgcggc ttcgccgacc tcatggggta cataccgctc 1920 

gtcggcgccc ctcttggagg cgctgccagg gccctggcgc atggcgtccg ggttctggaa 1980 

gacggcgtga actatgcaac agggaacctt cctggttgct cttag 2025 

<210> 9 
<211> 1268 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic Gag 
common region 



<400> 9 

gccaccatgg gcgcccgcgc cagcgtgctg agcggcggcg agctggacaa gtgggagaag 60
atccgcctgc gccccggcgg caagaagaag tacaagctga agcacatcgt gtgggccagc 120
cgcgagctgg agcgcttcgc cgtgaacccc ggcctgctgg agaccagcga gggctgccgc 180
cagatcctgg gccagctgca gcccagcctg cagaccggca gcgaggagct gcgcagcctg 240
tacaacaccg tggccaccct gtactgcgtg caccagcgca tcgacgtcaa ggacaccaag 300
gaggccctgg agaagatcga ggaggagcag aacaagtcca agaagaaggc ccagcaggcc 360
gccgccgccg ccggcaccgg caacagcagc caggtgagcc agaactaccc catcgtgcag 420
aacctgcagg gccagatggt gcaccaggcc atcagccccc gcaccctgaa cgcctgggtg 480
aaggtggtgg aggagaaggc cttcagcccc gaggtgatcc ccatgttcag cgccctgagc 540
gagggcgcca ccccccagga cctgaacacg atgttgaaca ccgtgggcgg ccaccaggcc 600
gccatgcaga tgctgaagga gaccatcaac gaggaggccg ccgagtggga ccgcgtgcac 660
cccgtgcacg ccggccccat cgcccccggc cagatgcgcg agccccgcgg cagcgacatc 720
gccggcacca ccagcaccct gcaggagcag atcggctgga tgaccaacaa cccccccatc 780
cccgtgggcg agatctacaa gcggtggatc atcctgggcc tgaacaagat cgtgcggatg 840
tacagcccca ccagcatcct ggacatccgc cagggcccca aggagccctt ccgcgactac 900
gtggaccgct tctacaagac cctgcgcgct gagcaggcca gccaggacgt gaagaactgg 960
atgaccgaga ccctgctggt gcagaacgcc aaccccgact gcaagaccat cctgaaggct 1020
ctcggccccg cggccaccct ggaggagatg atgaccgcct gccagggcgt gggcggcccc 1080
ggccacaagg cccgcgtgct ggccgaggcg atgagccagg tgacgaaccc ggcgaccatc 1140
atgatgcagc gcggcaactt ccgcaaccag cggaagaccg tcaagtgctt caactgcggc 1200
aaggagggcc acaccgccag gaactgccgc gccccccgca agaagggctg ctggcgctgc 1260
ggccgcga 1268



<210> 10 
<211> 20 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: HIV-Gag 
peptide p7G 

<400> 10 

Gly Gly His Gln Ala Ala Met Gln Met Leu Lys Glu Thr Ile Asn Glu
1 5 10 15

Glu Ala Ala Glu 
20 



<210> 11 
<211> 30 
<212> DNA 
<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer GAGS

<400> 11

aagaattcca tgggtgcgag agcgtcggta 30



<210> 12 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
p55-SAL3 

<400> 12 

attcgtcgac tgtgacgagg ggtcgttgcc 30 

<210> 13
<211> 34
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer CORESAL5
<400> 13

atttgtcgac gaatcctaaa cctcaaagaa aaac 34

<210> 14
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer 173CORE
<400> 14

tattggatcc taagagcaac caggaaggtt c 31

<210> 15
<211> 21
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer MS65
<400> 15

cgaccatcat ggatgcagcg c 21

<210> 16
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer MS66
<400> 16

aggattcgtc gagtcgctgc tggggtcgtt 30

<210> 17
<211> 26
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer XPANXNF
<400> 17

gcacgtgggc ccggcgcctc tagagc 26

<210> 18 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer XPANXNR 
<400> 18 

gctctagagg cgccgggccc acgtgc 26 

<210> 19 
<211> 20 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: HIV p55 Gag 
Major Homology Region 

<400> 19 

Asp Ile Arg Gln Gly Pro Lys Glu Pro Phe Arg Asp Tyr Val Asp Arg
1 5 10 15

Phe Tyr Lys Thr 
20 



<210> 20 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic p55 
Gag Major Homology Region 

<400> 20 

gacatccgcc agggccccaa ggagcccttc cgcgactacg tggaccgctt ctacaagacc 60 

<210> 21 
<211> 15 
<212> PRT

<213> Human immunodeficiency virus 
<400> 21 

Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg
1 5 10 15



<210> 22 
<211> 5 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 22 

Lys Ala Lys Arg Arg 
1 5 



<210> 23 
<211> 4 
<212> PRT 

<213> Human immunodeficiency virus 

<400> 23 
Arg Glu Lys Arg 
1 



<210> 24 
<211> 15 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: aa of 
mut7.SF162 cleavage site 

<400> 24 

Ala Pro Thr Lys Ala Ile Ser Ser Val Val Gln Ser Glu Lys Ser
1 5 10 15



<210> 25 
<211> 15 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: aa of 
mut8.SF162 cleavage site 

<400> 25 

Ala Pro Thr Ile Ala Ile Ser Ser Val Val Gln Ser Glu Lys Ser
1 5 10 15



<210> 26 
<211> 15 
<212> PRT 
<220>

<223> Description of Artificial Sequence: aa of 
mut.SF162 cleavage site 

<400> 26 

Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys Ser
1 5 10 15



<210> 27 
<211> 15 
<212> PRT 

<213> Human immunodeficiency virus 
<220> 

<223> Description of Artificial Sequence: aa of native 
cleavage site in US4

<400> 27 

Ala Pro Thr Gln Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg
1 5 10 15 



<210> 28 
<211> 5 
<212> PRT 

<213> Human immunodeficiency virus 
<220> 

<223> Description of Artificial Sequence: aa of second 
cleavage site in US4 

<400> 28 

Gln Ala Lys Arg Arg
1 5 



<210> 29 
<211> 15 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: aa of mut.US4 
cleavage site 

<400> 29 

Ala Pro Thr Gln Ala Lys Arg Arg Val Val Gln Arg Glu Lys Ser
1 5 10 15



<210> 30 
<211> 1419 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 30 
gtagaaaaat tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60
actctatttt gtgcatcaga tgctaaagcc tatgacacag aggtacataa tgtctgggcc 120
acacatgcct gtgtacccac agaccctaac ccacaagaaa tagtattgga aaatgtgaca 180
gaaaatttta acatgtggaa aaataacatg gtagaacaga tgcatgagga tataatcagt 240
ttatgggatc aaagtctaaa gccatgtgta aagttaaccc cactctgtgt tactctacat 300
tgcactaatt tgaagaatgc tactaatacc aagagtagta attggaaaga gatggacaga 360
ggagaaataa aaaattgctc tttcaaggtc accacaagca taagaaataa gatgcagaaa 420
gaatatgcac ttttttataa acttgatgta gtaccaatag ataatgataa tacaagctat 480
aaattgataa attgtaacac ctcagtcatt acacaggcct gtccaaaggt atcctttgaa 540
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taatgataag 600
aagttcaatg gatcaggacc atgtacaaat gtcagcacag tacaatgtac acatggaatt 660
aggccagtag tgtcaactca attgctgtta aatggcagtc tagcagaaga aggggtagta 720
attagatctg aaaatttcac agacaatgct aaaactataa tagtacagct gaaggaatct 780
gtagaaatta attgtacaag acctaacaat aatacaagaa aaagtataac tataggaccg 840
gggagagcat tttatgcaac aggagacata ataggagata taagacaagc acattgtaac 900
attagtggag aaaaatggaa taacacttta aaacagatag ttacaaaatt acaagcacaa 960
tttgggaata aaacaatagt ctttaagcaa tcctcaggag gggacccaga aattgtaatg 1020
cacagtttta attgtggagg ggaatttttc tactgtaatt caacacagct ttttaatagt 1080
acttggaata atactatagg gccaaataac actaatggaa ctatcacact cccatgcaga 1140
ataaaacaaa ttataaacag gtggcaggaa gtaggaaaag caatgtatgc ccctcccatc 1200
agaggacaaa ttagatgctc atcaaatatt acaggactgc tattaacaag agatggtggt 1260
aaagagatca gtaacaccac cgagatcttc agacctggag gtggagatat gagggacaat 1320
tggagaagtg aattatataa atataaagta gtaaaaattg agccattagg agtagcaccc 1380
accaaggcaa agagaagagt ggtgcagaga gaaaaaaga 1419



<210> 31 
<211> 1932 
<212> DNA 

<213> Human immunodeficiency virus 



<400> 31 

gtagaaaaat tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60
actctatttt gtgcatcaga tgctaaagcc tatgacacag aggtacataa tgtctgggcc 120
acacatgcct gtgtacccac agaccctaac ccacaagaaa tagtattgga aaatgtgaca 180
gaaaatttta acatgtggaa aaataacatg gtagaacaga tgcatgagga tataatcagt 240
ttatgggatc aaagtctaaa gccatgtgta aagttaaccc cactctgtgt tactctacat 300
tgcactaatt tgaagaatgc tactaatacc aagagtagta attggaaaga gatggacaga 360
ggagaaataa aaaattgctc tttcaaggtc accacaagca taagaaataa gatgcagaaa 420
gaatatgcac ttttttataa acttgatgta gtaccaatag ataatgataa tacaagctat 480
aaattgataa attgtaacac ctcagtcatt acacaggcct gtccaaaggt atcctttgaa 540
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taatgataag 600
aagttcaatg gatcaggacc atgtacaaat gtcagcacag tacaatgtac acatggaatt 660
aggccagtag tgtcaactca attgctgtta aatggcagtc tagcagaaga aggggtagta 720
attagatctg aaaatttcac agacaatgct aaaactataa tagtacagct gaaggaatct 780
gtagaaatta attgtacaag acctaacaat aatacaagaa aaagtataac tataggaccg 840
gggagagcat tttatgcaac aggagacata ataggagata taagacaagc acattgtaac 900
attagtggag aaaaatggaa taacacttta aaacagatag ttacaaaatt acaagcacaa 960
tttgggaata aaacaatagt ctttaagcaa tcctcaggag gggacccaga aattgtaatg 1020
cacagtttta attgtggagg ggaatttttc tactgtaatt caacacagct ttttaatagt 1080
acttggaata atactatagg gccaaataac actaatggaa ctatcacact cccatgcaga 1140
ataaaacaaa ttataaacag gtggcaggaa gtaggaaaag caatgtatgc ccctcccatc 1200
agaggacaaa ttagatgctc atcaaatatt acaggactgc tattaacaag agatggtggt 1260
aaagagatca gtaacaccac cgagatcttc agacctggag gtggagatat gagggacaat 1320
tggagaagtg aattatataa atataaagta gtaaaaattg agccattagg agtagcaccc 1380
accaaggcaa agagaagagt ggtgcagaga gaaaaaagag cagtgacgct aggagctatg 1440
ttccttgggt tcttgggagc agcaggaagc actatgggcg cacggtcact gacgctgacg 1500
gtacaggcca gacaattatt gtctggtata gtgcaacagc agaacaattt gctgagagct 1560
attgaggcgc aacagcatct gttgcaactc acagtctggg gcatcaagca gctccaggca 1620
agagtcctgg ctgtggaaag atacctaaag gatcaacagc tcctagggat ttggggttgc 1680
tctggaaaac tcatttgcac cactgctgtg ccttggaatg ctagttggag taataaatct 1740

ctggatcaga tttggaataa catgacctgg acggagtggg agagagaaat tgacaattac 1800 

acaaacttaa tatacacctt aattgaagaa tcgcagaacc aacaagaaaa gaatgaacaa 1860 

gaattattag aattggataa gtgggcaagt ttgtggaatt ggtttgacat atcaaaatgg 1920 

ctgtggtata ta 1932 

<210> 32 
<211> 2457 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 32 

gtagaaaaat tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60 
actctatttt gtgcaccaga tgctaaagcc tatgacacag aggtacataa tgtctgggcc 120 
acacatgcct gtgtacccac agaccctaac ccacaagaaa tagtattgga aaatgtgaca 180 
gaaaatttta acatgtggaa aaataacatg gtagaacaga tgcatgagga tataatcagt 240 
ttatgggatc aaagtctaaa gccatgtgta aagttaaccc cactctgtgt tactctacat 300 
tgcactaatt tgaagaatgc tactaatacc aagagtagta attggaaaga gatggacaga 360 
ggagaaataa aaaattgctc tttcaaggtc accacaagca taagaaataa gatgcagaaa 420 
gaatatgcac ttttttataa acttgatgta gtaccaatag ataatgataa tacaagctat 480 
aaattgataa attgtaacac ctcagtcatt acacaggcct gtccaaaggt atcctttgaa 540 
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taatgataag 600 
aagttcaatg gatcaggacc atgtacaaat gtcagcacag tacaatgtac acatggaatt 660 
aggccagtag tgtcaactca attgctgtta aatggcagtc tagcagaaga aggggtagta 720 
attagatctg aaaatttcac agacaatgct aaaactataa tagtacagct gaaggaatct 780 
gtagaaatta attgtacaag acctaacaat aatacaagaa aaagtataac tataggaccg 840 
gggagagcat tttatgcaac aggagacata ataggagata taagacaagc acattgtaac 900 
attagtggag aaaaatggaa taacacttta aaacagatag ttacaaaatt acaagcacaa 960 
tttgggaata aaacaatagt ctttaagcaa tcctcaggag gggacccaga aattgtaatg 1020 
cacagtttta attgtggagg ggaatttttc tactgtaatt caacacagct ttttaatagt 1080 
acttggaata atactatagg gccaaataac actaatggaa ctatcacact cccatgcaga 1140 
ataaaacaaa ttataaacag gtggcaggaa gtaggaaaag caatgtatgc ccctcccatc 1200 
agaggacaaa ttagatgctc atcaaatatt acaggactgc tattaacaag agatggtggt 1260 
aaagagatca gtaacaccac cgagatcttc agacctggag gtggagatat gagggacaat 1320 
tggagaagtg aattatataa atataaagta gtaaaaattg agccattagg agtagcaccc 1380 
accaaggcaa agagaagagt ggtgcagaga gaaaaaagag cagtgacgct aggagctatg 1440 
ttccttgggt tcttgggagc agcaggaagc actatgggcg cacggtcact gacgctgacg 1500 
gtacaggcca gacaattatt gtctggtata gtgcaacagc agaacaattt gctgagagct 1560 
attgaggcgc aacagcatct gttgcaactc acagtctggg gcatcaagca gctccaggca 1620 
agagtcctgg ctgtggaaag atacctaaag gatcaacagc tcctagggat ttggggttgc 1680 
tctggaaaac tcatttgcac cactgctgtg ccttggaatg ctagttggag taataaatct 1740 
ctggatcaga tttggaataa catgacctgg atggagtggg agagagaaat tgacaattac 1800 
acaaacttaa tatacacctt aattgaagaa tcgcagaacc aacaagaaaa gaatgaacaa 1860 
gaattattag aattggataa gtgggcaagt ttgtggaatt ggtttgacat atcaaaatgg 1920 
ctgtggtata taaaaatatt cataatgata gtaggaggtt tagtaggttt aaggatagtt 1980 
tttactgtgc tttctatagt gaatagagtt aggcagggat actcaccatt atcatttcag 2040
acccgcttcc cagccccaag gggacccgac aggcccgaag gaatcgaaga agaaggtgga 2100 
gagagagaca gagacagatc cagtccatta gtgcatggat tattagcact catctgggac 2160 
gatctacgga gcctgtgcct cttcagctac caccgcttga gagacttaat cttgattgca 2220 
gcgaggattg tggaacttct gggacgcagg gggtgggaag ccctcaagta ttgggggaat 2280 
ctcctgcagt attggattca ggaactaaag aatagtgctg ttagtttgtt tgatgccata 2340 
gctatagcag tagctgaggg gacagatagg attatagaag tagcacaaag aattggtaga 2400 
gcttttctcc acatacctag aagaataaga cagggctttg aaagggcttt gctataa 2457 

<210> 33 

<211> 1453 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: gp120.modSF162
<400> 33 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 

agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480

agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540

atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 

gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 

gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 

accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 

agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840

atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 

cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 

gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020

atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 

ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 

aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 

ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260 

aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320 

ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380

ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440

atcgagcccc tgg 1453 

<210> 34 
<211> 1387 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp120.modSF162.delV2

<400> 34 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 

agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480

ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540

gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600 

aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 

atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720 

gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780 

agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840

cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 

aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 

cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020 

atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 

agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200

atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260 

ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320

aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380 

cccacca 1387 

<210> 35 
<211> 1323 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp120.modSF162.delV1/V2

<400> 35 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggcgc cggcaactgc cagaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720 
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900 
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960 
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200 
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260 
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgctaactc 1320
gag 1323 

<210> 36 
<211> 2025 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: gp140.modSF162
<400> 36 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260 
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320 
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380 
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440
atcgagcccc tgggcgtggc ccccaccaag gccaagcgcc gcgtggtgca gcgcgagaag 1500 
cgcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560 
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620 
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680 
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920 
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980 
aactggttcg acatcagcaa gtggctgtgg tacatctaac tcgag 2025 

<210> 37 
<211> 1944 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.modSF162.delV2

<400> 37 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480 
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600 
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720 
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780 
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840 
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020 
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200 
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260 
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320 
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380 
cccaccaagg ccaagcgccg cgtggtgcag cgcgagaagc gcgccgtgac cctgggcgcc 1440 
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500

accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560 

gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620

gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680 

tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740

agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800 

tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860 

caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920 

tggctgtggt acatctaact cgag 1944 

<210> 38 
<211> 1944 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.modSF162.delV1/V2

<400> 38 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540 
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600 
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720 
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780 
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020 
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200 
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260 
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380
cccaccaagg ccaagcgccg cgtggtgcag cgcgagaagc gcgccgtgac cctgggcgcc 1440
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500 
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560 
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620 
gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680 
tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740
agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800
tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920 
tggctgtggt acatctaact cgag 1944 

<210> 39 
<211> 2025 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence:
gp140.mut.modSF162

<400> 39

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440
atcgagcccc tgggcgtggc ccccaccaag gccaagcgcc gcgtggtgca gcgcgagaag 1500
agcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980
aactggttcg acatcagcaa gtggctgtgg tacatctaac tcgag 2025

<210> 40
<211> 1944
<212> DNA

<213> Artificial Sequence



<220>

<223> Description of Artificial Sequence:
gp140.mut.modSF162.delV2

<400> 40

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600

aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 

atccgccccg tggtgagcac ccagccgctg ctgaacggca gcctggccga ggagggcgtg 720 

gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780 

agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840 

cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 

aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 

cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020 

atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 

agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 

cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200 

atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260 

ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320 

aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380 

cccaccaagg ccaagcgccg cgtggtgcag cgcgagaaga gcgccgtgac cctgggcgcc 1440 

atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500 

accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560 

gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620 

gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680 

tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740 

agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800 

tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860 

caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920 

tggctgtggt acatctaact cgag 1944 

<210> 41 
<211> 1836 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.mut.modSF162.delV1/V2

<400> 41 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgggcgc cggcaactgc cagaccagcg tgatcaccca ggcctgcccc 420

aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 

aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540

tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 

gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660 

cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720

atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 

caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840

aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900 

cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960 

cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020 

accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080 

tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140

acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200 

gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260 

ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gagcgccgtg 1320 

accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380 

agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440 
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800
gacatcagca agtggctgtg gtacatctaa ctcgag 1836

<210> 42
<211> 2025
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
gp140.mut7.modSF162

<400> 42

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440
atcgagcccc tgggcgtggc ccccaccaag gccatcagca gcgtggtgca gagcgagaag 1500
agcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980
aactggttcg acatcagcaa gtggctgtgg tacatctaac tcgag 2025

<210> 43
<211> 1944
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence:
gp140.mut7.modSF162.delV2



<400> 43

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380
cccaccaagg ccatcagcag cgtggtgcag agcgagaaga gcgccgtgac cctgggcgcc 1440
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620
gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680
tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740
agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800
tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920
tggctgtggt acatctaact cgag 1944



<210> 44 
<211> 1836 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
gp140.mut7.modSF162.delV1/V2

<400> 44

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgggcgc cggcaactgc cagaccagcg tgatcaccca ggcctgcccc 420
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720

atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 

caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840

aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900 

cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960 

cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020 

accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080 

tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140

acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200 

gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260 

ctgggcgtgg cccccaccaa ggccatcagc agcgtggtgc agagcgagaa gagcgccgtg 1320

accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380

agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440

aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500 

aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560 

ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620 

tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680 

gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740 

gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800 

gacatcagca agtggctgtg gtacatctaa ctcgag 1836 

<210> 45 
<211> 2025 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.mut8.modSF162

<400> 45 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260 
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320 
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440
atcgagcccc tgggcgtggc ccccaccatc gccatcagca gcgtggtgca gagcgagaag 1500 
agcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560 
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620 
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680 
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980
aactggttcg acatcagcaa gtggctgtgg tacatctaac tcgag 2025



<210> 46 
<211> 1944 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp140.mut8.modSF162.delV2

<400> 46 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600 
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720 
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780 
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020 
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200 
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260 
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320 
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380
cccaccatcg ccatcagcag cgtggtgcag agcgagaaga gcgccgtgac cctgggcgcc 1440
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500 
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560 
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620 
gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1680 
tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740
agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800 
tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860 
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920 
tggctgtggt acatctaact cgag 1944 

<210> 47 
<211> 1836 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.mut8.modSF162.delV1/V2
<400> 47

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggcgc cggcaactgc cagaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540 
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720 
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840 
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960 
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020 
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200 
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260 
ctgggcgtgg cccccaccat cgccatcagc agcgtggtgc agagcgagaa gagcgccgtg 1320 
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380 
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500 
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680 
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740 
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800 
gacatcagca agtggctgtg gtacatctaa ctcgag 1836 

<210> 48 
<211> 2547 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: gp160.modSF162
<400> 48 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 

agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480

agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 

atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 

gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 

gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 

accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 

agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840

atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 

cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 

gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaaccgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440
atcgagcccc tgggcgtggc ccccaccaag gccaagcgcc gcgtggtgca gcgcgagaag 1500
cgcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980
aactggttcg acatcagcaa gtggctgtgg tacatcaaga tcttcatcat gatcgtgggc 2040
ggcctggtgg gcctgcgcat cgtgttcacc gtgctgagca tcgtgaaccg cgtgcgccag 2100
ggctacagcc ccctgagctt ccagacccgc ttccccgccc cccgcggccc cgaccgcccc 2160
gagggcatcg aggaggaggg cggcgagcgc gaccgcgacc gcagcagccc cctggtgcac 2220
ggcctgctgg ccctgatctg ggacgacctg cgcagcctgt gcctgttcag ctaccaccgc 2280
ctgcgcgacc tgatcctgat cgccgcccgc atcgtggagc tgctgggccg ccgcggctgg 2340
gaggccctga agtactgggg caacctgctg cagtactgga tccaggagct gaagaacagc 2400
gccgtgagcc tgttcgacgc catcgccatc gccgtggccg agggcaccga ccgcatcatc 2460
gaggtggccc agcgcatcgg ccgcgccttc ctgcacatcc cccgccgcat ccgccagggc 2520
ttcgagcgcg ccctgctgta actcgag 2547



<210> 49 

<211> 2466 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp160.modSF162.delV2



<400> 49 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgggcgcc 480
ggcaagctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 540 
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaacgac 600 
aagaagttca acggcagcgg cccctgcacc aacgtgagca ccgtgcagtg cacccacggc 660 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggagggcgtg 720 
gtgatccgca gcgagaactt caccgacaac gccaagacca tcatcgtgca gctgaaggag 780 
agcgtggaga tcaactgcac ccgccccaac aacaacaccc gcaagagcat caccatcggc 840 
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 900 
aacatcagcg gcgagaagtg gaacaacacc ctgaagcaga tcgtgaccaa gctgcaggcc 960 
cagttcggca acaagaccat cgtgttcaag cagagcagcg gcggcgaccc cgagatcgtg 1020 
atgcacagct tcaactgcgg cggcgagttc ttctactgca acagcaccca gctgttcaac 1080 
agcacctgga acaacaccat cggccccaac aacaccaacg gcaccatcac cctgccctgc 1140 
cgcatcaagc agatcatcaa ccgctggcag gaggtgggca aggccatgta cgcccccccc 1200 
atccgcggcc agatccgctg cagcagcaac atcaccggcc tgctgctgac ccgcgacggc 1260 
ggcaaggaga tcagcaacac caccgagatc ttccgccccg gcggcggcga catgcgcgac 1320 
aactggcgca gcgagctgta caagtacaag gtggtgaaga tcgagcccct gggcgtggcc 1380
cccaccaagg ccaagcgccg cgtggtgcag cgcgagaagc gcgccgtgac cctgggcgcc 1440
atgttcctgg gcttcctggg cgccgccggc agcaccatgg gcgcccgcag cctgaccctg 1500
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1560
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1620
gcccgcgtgc tggccgtgga gcgctacctg aaggaccagc agctgctggg catccggggc 1680
tgcagcggca agctgatctg caccaccgcc gtgccctgga acgccagctg gagcaacaag 1740
agcctggacc agatctggaa caacatgacc tggatggagt gggagcgcga gatcgacaac 1800
tacaccaacc tgatctacac cctgatcgag gagagccaga accagcagga gaagaacgag 1860
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcagcaag 1920
tggctgtggt acatcaagat cttcatcatg atcgtgggcg gcctggtggg cctgcgcatc 1980
gtgttcaccg tgctgagcat cgtgaaccgc gtgcgccagg gctacagccc cctgagcttc 2040
cagacccgct tccccgcccc ccgcggcccc gaccgccccg agggcatcga ggaggagggc 2100
ggcgagcgcg accgcgaccg cagcagcccc ctggtgcacg gcctgctggc cctgatctgg 2160
gacgacctgc gcagcctgtg cctgttcagc taccaccgcc tgcgcgacct gatcctgatc 2220
gccgcccgca tcgtggagct gctgggccgc cgcggctggg aggccctgaa gtactggggc 2280
aacctgctgc agtactggat ccaggagctg aagaacagcg ccgtgagcct gttcgacgcc 2340
atcgccatcg ccgtggccga gggcaccgac cgcatcatcg aggtggccca gcgcatcggc 2400
cgcgccttcc tgcacatccc ccgccgcatc cgccagggct tcgagcgcgc cctgctgtaa 2460
ctcgag 2466



<210> 50 

<211> 2358 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp160.modSF162.delV1/V2



<400> 50 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgggcgc cggcaactgc cagaccagcg tgatcaccca ggcctgcccc 420
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1320
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1860
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1920
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1980
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 2040
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2100
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2160
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2220
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2280
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2340
gccctgctgt aactcgag 2358



<210> 51 

<211> 1494 

<212> DNA 

<213> Human immunodeficiency virus 



<400> 51 

acaacagtct tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60
actctgtttt gtgcatcaga tgctaaagca tacaaagcag aggcacataa cgtctgggct 120
acacatgcct gtgtacccac agaccccaac ccacaggaag taaatttaac aaatgtgaca 180
gaaaatttta acatgtggaa aaataacatg gtggaacaga tgcatgagga tataatcagt 240
ttatgggatc aaagcctaaa gccatgtgta aaattaaccc cactctgtgt tactttaaat 300
tgtactgata agttgacagg tagtactaat ggcacaaata gtactagtgg cactaatagt 360
actagtggca ctaatagtac tagtactaat agtactgata gttgggaaaa gatgccagaa 420
ggagaaataa aaaactgctc tttcaatatc accacaagtg taagagataa agtgcagaaa 480
gaatattctc tcttctataa acttgatgta gtaccaatag ataatgataa tgctagctat 540
agattgataa attgtaatac ctcagtcatt acacaagcct gtccaaaggt atcttttgaa 600
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taaagataag 660
aagttcaatg gaacaggacc atgtaaaaat gtcagcacag tacaatgcac acatggaatt 720
agaccagtag tatcaactca actgctgtta aatggcagtc tagcagaaga agagatagta 780
cttagatctg aaaatttcac agacaatgct aaaaccataa tagtacagct gaatgaatct 840
gtagaaatta attgtataag acccaacaat aatacaagaa aaagtataca tataggacca 900
gggagagcat tttatgcaac aggtgatata ataggagaca taagacaagc acattgtaac 960
attagtaaag caaactggac taacacttta gaacagatag ttgaaaaatt aagagaacaa 1020
tttgggaata ataaaacaat aatctttaat tcatcctcag gaggggaccc agaaattgta 1080
tttcacagtt ttaattgtgg aggggaattt ttctattgta atacatcaca actatttaat 1140
agtacctgga atattactga agaggtaaat aagactaaag aaaatgacac tatcatactc 1200
ccatgcagaa taagacaaat tataaacatg tggcaagaag taggaaaagc aatgtatgcc 1260
cctcccatca gaggacaaat taaatgttca tcaaatatta cagggctgct attaactaga 1320
gatggtggta ctaacaataa taggacgaac gacaccgaga ccttcagacc tgggggagga 1380
aacatgaagg acaattggag aagtgaatta tataaatata aagtagtaag aattgaacca 1440
ttaggagtag cacccaccca ggcaaagaga agagtggtgc aaagagagaa aaga 1494



<210> 52 

<211> 2007 

<212> DNA 

<213> Human immunodeficiency virus 



<400> 52 

acaacagtct tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60
actctgtttt gtgcatcaga tgctaaagca tacaaagcag aggcacataa cgtctgggct 120
acacatgcct gtgtacccac agaccccaac ccacaggaag taaatttaac aaatgtgaca 180
gaaaatttta acatgtggaa aaataacatg gtggaacaga tgcatgagga tataatcagt 240
ttatgggatc aaagcctaaa gccatgtgta aaattaaccc cactctgtgt tactttaaat 300
tgtactgata agttgacagg tagtactaat ggcacaaata gtactagtgg cactaatagt 360
actagtggca ctaatagtac tagtactaat agtactgata gttgggaaaa gatgccagaa 420
ggagaaataa aaaactgctc tttcaatatc accacaagtg taagagataa agtgcagaaa 480
gaatattctc tcttctataa acttgatgta gtaccaatag ataatgataa tgctagctat 540
agattgataa attgtaatac ctcagtcatt acacaagcct gtccaaaggt atcttttgaa 600
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taaagataag 660
aagttcaatg gaacaggacc atgtaaaaat gtcagcacag tacaatgcac acatggaatt 720
agaccagtag tatcaactca actgctgtta aatggcagtc tagcagaaga agagatagta 780
cttagatctg aaaatttcac agacaatgct aaaaccataa tagtacagct gaatgaatct 840
gtagaaatta attgtataag acccaacaat aatacaagaa aaagtataca tataggacca 900
gggagagcat tttatgcaac aggtgatata ataggagaca taagacaagc acattgtaac 960
attagtaaag caaactggac taacacttta gaacagatag ttgaaaaatt aagagaacaa 1020
tttgggaata ataaaacaat aatctttaat tcatcctcag gaggggaccc agaaattgta 1080
tttcacagtt ttaattgtgg aggggaattt ttctattgta atacatcaca actatttaat 1140
agtacctgga atattactga agaggtaaat aagactaaag aaaatgacac tatcatactc 1200
ccatgcagaa taagacaaat tataaacatg tggcaagaag taggaaaagc aatgtatgcc 1260
cctcccatca gaggacaaat taaatgttca tcaaatatta cagggctgct attaactaga 1320
gatggtggta ctaacaataa taggacgaac gacaccgaga ccttcagacc tgggggagga 1380
aacatgaagg acaattggag aagtgaatta tataaatata aagtagtaag aattgaacca 1440
ttaggagtag cacccaccca ggcaaagaga agagtggtgc aaagagagaa aagagcagtg 1500
ggactaggag ctttgttcat tgggttcttg ggagcagcag gaagcactat gggcgcagcg 1560
tcagtgacgc tgacggtaca ggccagacaa ttattgtctg gtatagtgca acagcagaac 1620
aatttgctga gagctattga ggcgcaacag catctgttgc aactcacggt ctggggcatc 1680
aaacagctcc aggcaagaat cctggctgtg gaaagatacc taaaggatca acagctccta 1740
gggatttggg gttgctctgg aaaactcatt tgcaccacta ctgtgccttg gaactctagt 1800
tggagtaata aatctctgac tgagatttgg gataatatga cctggatgga gtgggaaaga 1860
gaaattggca attatacagg cttaatatac aatttaattg aaatagcaca aaaccagcaa 1920
gaaaagaatg aacaagaatt attggaatta gacaagtggg caagtttgtg gaattggttt 1980
gatataacaa actggctgtg gtatata 2007



<210> 53 

<211> 2532 

<212> DNA 

<213> Human immunodeficiency virus 



<400> 53 

acaacagtct tgtgggtcac agtctattat ggggtacctg tgtggaaaga agcaaccacc 60
actctgtttt gtgcatcaga tgctaaagca tacaaagcag aggcacataa cgtctgggct 120
acacatgcct gtgtacccac agaccccaac ccacaggaag taaatttaac aaatgtgaca 180
gaaaatttta acatgtggaa aaataacatg gtggaacaga tgcatgagga tataatcagt 240
ttatgggatc aaagcctaaa gccatgtgta aaattaaccc cactctgtgt tactttaaat 300
tgtactgata agttgacagg tagtactaat ggcacaaata gtactagtgg cactaatagt 360
actagtggca ctaatagtac tagtactaat agtactgata gttgggaaaa gatgccagaa 420
ggagaaataa aaaactgctc tttcaatatc accacaagtg taagagataa agtgcagaaa 480
gaatattctc tcttctataa acttgatgta gtaccaatag ataatgataa tgctagctat 540
agattgataa attgtaatac ctcagtcatt acacaagcct gtccaaaggt atcttttgaa 600
ccaattccca tacattattg tgccccggct ggttttgcga ttctaaagtg taaagataag 660
aagttcaatg gaacaggacc atgtaaaaat gtcagcacag tacaatgcac acatggaatt 720
agaccagtag tatcaactca actgctgtta aatggcagtc tagcagaaga agagatagta 780
cttagatctg aaaatttcac agacaatgct aaaaccataa tagtacagct gaatgaatct 840
gtagaaatta attgtataag acccaacaat aatacaagaa aaagtataca tataggacca 900
gggagagcat tttatgcaac aggtgatata ataggagaca taagacaagc acattgtaac 960
attagtaaag caaactggac taacacttta gaacagatag ttgaaaaatt aagagaacaa 1020
tttgggaata ataaaacaat aatctttaat tcatcctcag gaggggaccc agaaattgta 1080
tttcacagtt ttaattgtgg aggggaattt ttctattgta atacatcaca actatttaat 1140
agtacctgga atattactga agaggtaaat aagactaaag aaaatgacac tatcatactc 1200
ccatgcagaa taagacaaat tataaacatg tggcaagaag taggaaaagc aatgtatgcc 1260
cctcccatca gaggacaaat taaatgttca tcaaatatta cagggctgct attaactaga 1320
gatggtggta ctaacaataa taggacgaac gacaccgaga ccttcagacc tgggggagga 1380
aacatgaagg acaattggag aagtgaatta tataaatata aagtagtaag aattgaacca 1440
ttaggagtag cacccaccca ggcaaagaga agagtggtgc aaagagagaa aagagcagtg 1500
ggactaggag ctttgttcat tgggttcttg ggagcagcag gaagcactat gggcgcagcg 1560
tcagtgacgc tgacggtaca ggccagacaa ttattgtctg gtatagtgca acagcagaac 1620
aatttgctga gagctattga ggcgcaacag catctgttgc aactcacggt ctggggcatc 1680 
aaacagctcc aggcaagaat cctggctgtg gaaagatacc taaaggatca acagctccta 1740 
gggatttggg gttgctctgg aaaactcatt tgcaccacta ctgtgccttg gaactctagt 1800 
tggagtaata aatctctgac tgagatttgg gataatatga cctggatgga gtgggaaaga 1860 
gaaattggca attatacagg cttaatatac aatttaattg aaatagcaca aaaccagcaa 1920 
gaaaagaatg aacaagaatt attggaatta gacaagtggg caagtttgtg gaattggttt 1980 
gatataacaa actggctgtg gtatataaga atattcataa tgatagtagg aggcttgata 2040 
ggtttaagaa tagtttttgc tgtactttct atagtgaata gagttaggca gggatactca 2100 
ccaatatcat tgcagacccg cctcccagct cagaggggac ccgacaggcc cgaaggaatc 2160 
gaagaagaag gtggagagag agacagagac agatccaatc gattagtgca tggattattg 2220 
gcactcatct gggacgatct gcggagcctg tgcctcttca gctaccaccg cttgagagac 2280 
ttactcttga ttgtagcgag gattgtggaa cttctgggac gcagggggtg ggaagccctc 2340 
aagtattggt ggaatctcct gcagtattgg agtcaggagc taaagagtag tgctgttagt 2400 
ttgtttaatg ccacagcaat agcagtagct gaagggacag ataggattat agaaatagta 2460 
caaagaattt ttagagctgt aattcacata cctagaagaa taagacaggg cttggagagg 2520 
gctttactat aa 2532 



<210> 54 
<211> 1599 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: gp120.modUS4
<400> 54 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 

gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420 

aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480 

gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540 

agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600 

atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660 

gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720 

gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780 

accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840

agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900 

atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960 

cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020 

gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080 

atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140

agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200 

tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260 

aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320 

gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380 

attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440 

gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500 

tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560 

gtgcagcgcg agaagcgcta agatatcgga tcctctaga 1599 



<210> 55 

<211> 1350 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence:
gp120.modUS4.del 128-194



<400> 55 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgggggc agggaactgc gagaccagcg tgatcaccca ggcctgcccc 420
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaagg acaagaagtt caacggcacc ggcccctgca agaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600
gaggaggaga tcgtgctgcg ctccgagaac ttcaccgaca acgccaagac catcatcgtg 660
cagctgaacg agtccgtgga gatcaactgc atccgcccca acaacaacac gcgtaagagc 720
atccacatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780
caggcccact gcaacatcag caaggccaac tggaccaaca ccctcgagca gatcgtggag 840
aagctgcgcg agcagttcgg caacaacaag accatcatct tcaacagcag cagcggcggc 900
gaccccgaga tcgtgttcca cagcttcaac tgcggcggcg agttcttcta ctgcaacacc 960
agccagctgt tcaacagcac ctggaacatc accgaggagg tgaacaagac caaggagaac 1020
gacaccatca tcctgccctg ccgcatccgc cagatcatca acatgtggca ggaggtgggc 1080
aaggccatgt acgccccccc catccgcggc cagatcaagt gcagcagcaa tattaccggc 1140
ctgctgctga cccgcgacgg cggcaccaac aacaaccgca ccaacgacac cgagaccttc 1200
cgccccggcg gcggcaacat gaaggacaac tggcgcagcg agctgtacaa gtacaaggtg 1260
gtgcgcatcg agcccctggg cgtggccccc acccaggcca agcgccgcgt ggtgcagcgc 1320
gagaagcgct aagatatcgg atcctctaga 1350



<210> 56 
<211> 2112 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: gp140.modUS4



<400> 56 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540
agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600
atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720
gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840
agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900
atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960
cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020
gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080
atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140
agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200
tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260
aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320

gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380 

attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440 

gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500 

tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560 

gtgcagcgcg agaagcgcgc cgtgggcctg ggcgccctgt tcatcggctt cctgggcgcc 1620 

gccgggagca ccatgggcgc cgcctccgtg accctgaccg tgcaggcccg ccagctgctg 1680 

agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1740 

ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcatcctggc cgtggagcgc 1800 

tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1860 

accaccgtgc cctggaacag cagctggagc aacaagagcc tgaccgagat ctgggacaac 1920 

atgacctgga tggagtggga gcgcgagatc ggcaactaca ccggcctgat ctacaacctg 1980 

atcgagatcg cccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 2040

tgggccagcc tgtggaactg gttcgacatc accaactggc tgtggtacat ctaagatatc 2100 
ggatcctcta ga 2112 



<210> 57 

<211> 2112 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp140.mut.modUS4



<400> 57 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540
agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600
atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720
gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840
agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900
atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960
cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020
gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080
atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140
agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200
tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260
aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380
attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440
gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500
tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560
gtgcagcgcg agaagagcgc cgtgggcctg ggcgccctgt tcatcggctt cctgggcgcc 1620
gccgggagca ccatgggcgc cgcctccgtg accctgaccg tgcaggcccg ccagctgctg 1680
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1740
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcatcctggc cgtggagcgc 1800
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1860
accaccgtgc cctggaacag cagctggagc aacaagagcc tgaccgagat ctgggacaac 1920
atgacctgga tggagtggga gcgcgagatc ggcaactaca ccggcctgat ctacaacctg 1980
atcgagatcg cccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 2040
tgggccagcc tgtggaactg gttcgacatc accaactggc tgtggtacat ctaagatatc 2100 
ggatcctcta ga 2112 

<210> 58 
<211> 2181 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: gp140TM.modUS4



<400> 58 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540
agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600
atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720
gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840
agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900
atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960
cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020
gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080
atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140
agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200
tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260
aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380
attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440
gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500
tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560
gtgcagcgcg agaagcgcgc cgtgggcctg ggcgccctgt tcatcggctt cctgggcgcc 1620
gccgggagca ccatgggcgc cgcctccgtg accctgaccg tgcaggcccg ccagctgctg 1680
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1740
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcatcctggc cgtggagcgc 1800
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1860
accaccgtgc cctggaacag cagctggagc aacaagagcc tgaccgagat ctgggacaac 1920
atgacctgga tggagtggga gcgcgagatc ggcaactaca ccggcctgat ctacaacctg 1980
atcgagatcg cccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 2040
tgggccagcc tgtggaactg gttcgacatc accaactggc tgtggtacat ccgcatcttc 2100
atcatgatcg tgggcggcct gatcggcctg cgcatcgtgt tcgccgtgct gagcatcgtg 2160
taagatatcg gatcctctag a 2181



<210> 59 
<211> 1818 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp140.modUS4.delV1/V2 
<400> 59 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 

gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360 

ggccaggcct gccccaaggt gagcttcgag cccatcccca tccactactg cgcccccgcc 420 

ggcttcgcca tcctgaagtg caaggacaag aagttcaacg gcaccggccc ctgcaagaac 480 

gtgagcaccg tgcagtgcac ccacggcatc cgccccgtgg tgagcaccca gctgctgctg 540 

aacggcagcc tggccgagga ggagatcgtg ctgcgctccg agaacttcac cgacaacgcc 600 

aagaccatca tcgtgcagct gaacgagtcc gtggagatca actgcatccg ccccaacaac 660 

aacacgcgta agagcatcca catcggcccc ggccgcgcct tctacgccac cggcgacatc 720 

atcggcgaca tccgccaggc ccactgcaac atcagcaagg ccaactggac caacaccctc 780 

gagcagatcg tggagaagct gcgcgagcag ttcggcaaca acaagaccat catcttcaac 840 

agcagcagcg gcggcgaccc cgagatcgtg ttccacagct tcaactgcgg cggcgagttc 900 

ttctactgca acaccagcca gctgttcaac agcacctgga acatcaccga ggaggtgaac 960 

aagaccaagg agaacgacac catcatcctg ccctgccgca tccgccagat catcaacatg 1020 

tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat caagtgcagc 1080 

agcaatatta ccggcctgct gctgacccgc gacggcggca ccaacaacaa ccgcaccaac 1140 

gacaccgaga ccttccgccc cggcggcggc aacatgaagg acaactggcg cagcgagctg 1200 

tacaagtaca aggtggtgcg catcgagccc ctgggcgtgg cccccaccca ggccaagcgc 1260 

cgcgtggtgc agcgcgagaa gcgcgccgtg ggcctgggcg ccctgttcat cggcttcctg 1320 

ggcgccgccg ggagcaccat gggcgccgcc tccgtgaccc tgaccgtgca ggcccgccag 1380 

ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440 

cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcat cctggccgtg 1500 

gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560 

tgcaccacca ccgtgccctg gaacagcagc tggagcaaca agagcctgac cgagatctgg 1620 

gacaacatga cctggatgga gtgggagcgc gagatcggca actacaccgg cctgatctac 1680 

aacctgatcg agatcgccca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740 

gacaagtggg ccagcctgtg gaactggttc gacatcacca actggctgtg gtacatctaa 1800 
gatatcggat cctctaga 1818 

<210> 60 
<211> 2031 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp140.modUS4.delV2 

<400> 60 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 

gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420 

aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480 

gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcggcgcc 540 

ggccgcctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 600 

gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaaggac 660 

aagaagttca acggcaccgg cccctgcaag aacgtgagca ccgtgcagtg cacccacggc 720 

atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggaggagatc 780 

gtgctgcgct ccgagaactt caccgacaac gccaagacca tcatcgtgca gctgaacgag 840 

tccgtggaga tcaactgcat ccgccccaac aacaacacgc gtaagagcat ccacatcggc 900 

cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 960 
aacatcagca aggccaactg gaccaacacc ctcgagcaga tcgtggagaa gctgcgcgag 1020 
cagttcggca acaacaagac catcatcttc aacagcagca gcggcggcga ccccgagatc 1080 
gtgttccaca gcttcaactg cggcggcgag ttcttctact gcaacaccag ccagctgttc 1140 
aacagcacct ggaacatcac cgaggaggtg aacaagacca aggagaacga caccatcatc 1200 
ctgccctgcc gcatccgcca gatcatcaac atgtggcagg aggtgggcaa ggccatgtac 1260 
gcccccccca tccgcggcca gatcaagtgc agcagcaata ttaccggcct gctgctgacc 1320 
cgcgacggcg gcaccaacaa caaccgcacc aacgacaccg agaccttccg ccccggcggc 1380 
ggcaacatga aggacaactg gcgcagcgag ctgtacaagt acaaggtggt gcgcatcgag 1440 
cccctgggcg tggcccccac ccaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 
gtgggcctgg gcgccctgtt catcggcttc ctgggcgccg ccgggagcac catgggcgcc 1560 
gcctccgtga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620 
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 
atcaagcagc tgcaggcccg catcctggcc gtggagcgct acctgaagga ccagcagctg 1740 
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccaccgtgcc ctggaacagc 1800 
agctggagca acaagagcct gaccgagatc tgggacaaca tgacctggat ggagtgggag 1860 
cgcgagatcg gcaactacac cggcctgatc tacaacctga tcgagatcgc ccagaaccag 1920 
caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 
ttcgacatca ccaactggct gtggtacatc taagatatcg gatcctctag a 2031 



<210> 61 

<211> 1818 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp140.mut.modUS4.delV1/V2 



<400> 61 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360 
ggccaggcct gccccaaggt gagcttcgag cccatcccca tccactactg cgcccccgcc 420 
ggcttcgcca tcctgaagtg caaggacaag aagttcaacg gcaccggccc ctgcaagaac 480 
gtgagcaccg tgcagtgcac ccacggcatc cgccccgtgg tgagcaccca gctgctgctg 540 
aacggcagcc tggccgagga ggagatcgtg ctgcgctccg agaacttcac cgacaacgcc 600 
aagaccatca tcgtgcagct gaacgagtcc gtggagatca actgcatccg ccccaacaac 660 
aacacgcgta agagcatcca catcggcccc ggccgcgcct tctacgccac cggcgacatc 720 
atcggcgaca tccgccaggc ccactgcaac atcagcaagg ccaactggac caacaccctc 780 
gagcagatcg tggagaagct gcgcgagcag ttcggcaaca acaagaccat catcttcaac 840 
agcagcagcg gcggcgaccc cgagatcgtg ttccacagct tcaactgcgg cggcgagttc 900 
ttctactgca acaccagcca gctgttcaac agcacctgga acatcaccga ggaggtgaac 960 
aagaccaagg agaacgacac catcatcctg ccctgccgca tccgccagat catcaacatg 1020 
tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat caagtgcagc 1080 
agcaatatta ccggcctgct gctgacccgc gacggcggca ccaacaacaa ccgcaccaac 1140 
gacaccgaga ccttccgccc cggcggcggc aacatgaagg acaactggcg cagcgagctg 1200 
tacaagtaca aggtggtgcg catcgagccc ctgggcgtgg cccccaccca ggccaagcgc 1260 
cgcgtggtgc agcgcgagaa gagcgccgtg ggcctgggcg ccctgttcat cggcttcctg 1320 
ggcgccgccg ggagcaccat gggcgccgcc tccgtgaccc tgaccgtgca ggcccgccag 1380 
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440 
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcat cctggccgtg 1500 
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560 
tgcaccacca ccgtgccctg gaacagcagc tggagcaaca agagcctgac cgagatctgg 1620 
gacaacatga cctggatgga gtgggagcgc gagatcggca actacaccgg cctgatctac 1680 
aacctgatcg agatcgccca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740 
gacaagtggg ccagcctgtg gaactggttc gacatcacca actggctgtg gtacatctaa 1800 
gatatcggat cctctaga 1818 

<210> 62 
<211> 1818 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp140.modUS4.del 128-194 



<400> 62 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 

gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360 

ggccaggcct gccccaaggt gagcttcgag cccatcccca tccactactg cgcccccgcc 420 

ggcttcgcca tcctgaagtg caaggacaag aagttcaacg gcaccggccc ctgcaagaac 480 

gtgagcaccg tgcagtgcac ccacggcatc cgccccgtgg tgagcaccca gctgctgctg 540 

aacggcagcc tggccgagga ggagatcgtg ctgcgctccg agaacttcac cgacaacgcc 600 

aagaccatca tcgtgcagct gaacgagtcc gtggagatca actgcatccg ccccaacaac 660 

aacacgcgta agagcatcca catcggcccc ggccgcgcct tctacgccac cggcgacatc 720 

atcggcgaca tccgccaggc ccactgcaac atcagcaagg ccaactggac caacaccctc 780 

gagcagatcg tggagaagct gcgcgagcag ttcggcaaca acaagaccat catcttcaac 840 

agcagcagcg gcggcgaccc cgagatcgtg ttccacagct tcaactgcgg cggcgagttc 900 

ttctactgca acaccagcca gctgttcaac agcacctgga acatcaccga ggaggtgaac 960 

aagaccaagg agaacgacac catcatcctg ccctgccgca tccgccagat catcaacatg 1020 

tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat caagtgcagc 1080 

agcaatatta ccggcctgct gctgacccgc gacggcggca ccaacaacaa ccgcaccaac 1140 

gacaccgaga ccttccgccc cggcggcggc aacatgaagg acaactggcg cagcgagctg 1200 

tacaagtaca aggtggtgcg catcgagccc ctgggcgtgg cccccaccca ggccaagcgc 1260 

cgcgtggtgc agcgcgagaa gagcgccgtg ggcctgggcg ccctgttcat cggcttcctg 1320 

ggcgccgccg ggagcaccat gggcgccgcc tccgtgaccc tgaccgtgca ggcccgccag 1380 

ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440 

cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcat cctggccgtg 1500 

gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560 

tgcaccacca ccgtgccctg gaacagcagc tggagcaaca agagcctgac cgagatctgg 1620 

gacaacatga cctggatgga gtgggagcgc gagatcggca actacaccgg cctgatctac 1680 

aacctgatcg agatcgccca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740 

gacaagtggg ccagcctgtg gaactggttc gacatcacca actggctgtg gtacatctaa 1800 

gatatcggat cctctaga 1818 



<210> 63 

<211> 1863 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp140.mut.modUS4.del 128-194 



<400> 63 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggggc agggaactgc gagaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 
aagtgcaagg acaagaagtt caacggcacc ggcccctgca agaacgtgag caccgtgcag 540 
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggaggaga tcgtgctgcg ctccgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaacg agtccgtgga gatcaactgc atccgcccca acaacaacac gcgtaagagc 720 
atccacatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag caaggccaac tggaccaaca ccctcgagca gatcgtggag 840 
aagctgcgcg agcagttcgg caacaacaag accatcatct tcaacagcag cagcggcggc 900 
gaccccgaga tcgtgttcca cagcttcaac tgcggcggcg agttcttcta ctgcaacacc 960 
agccagctgt tcaacagcac ctggaacatc accgaggagg tgaacaagac caaggagaac 1020 
gacaccatca tcctgccctg ccgcatccgc cagatcatca acatgtggca ggaggtgggc 1080 
aaggccatgt acgccccccc catccgcggc cagatcaagt gcagcagcaa tattaccggc 1140 
ctgctgctga cccgcgacgg cggcaccaac aacaaccgca ccaacgacac cgagaccttc 1200 
cgccccggcg gcggcaacat gaaggacaac tggcgcagcg agctgtacaa gtacaaggtg 1260 
gtgcgcatcg agcccctggg cgtggccccc acccaggcca agcgccgcgt ggtgcagcgc 1320 
gagaagagcg ccgtgggcct gggcgccctg ttcatcggct tcctgggcgc cgccgggagc 1380 
accatgggcg ccgcctccgt gaccctgacc gtgcaggccc gccagctgct gagcggcatc 1440 
gtgcagcagc agaacaacct gctgcgcgcc atcgaggccc agcagcacct gctgcagctg 1500 
accgtgtggg gcatcaagca gctgcaggcc cgcatcctgg ccgtggagcg ctacctgaag 1560 
gaccagcagc tgctgggcat ctggggctgc agcggcaagc tgatctgcac caccaccgtg 1620 
ccctggaaca gcagctggag caacaagagc ctgaccgaga tctgggacaa catgacctgg 1680 
atggagtggg agcgcgagat cggcaactac accggcctga tctacaacct gatcgagatc 1740 
gcccagaacc agcaggagaa gaacgagcag gagctgctgg agctggacaa gtgggccagc 1800 
ctgtggaact ggttcgacat caccaactgg ctgtggtaca tctaagatat cggatcctct 1860 
aga 1863 



<210> 64 

<211> 2634 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: gp160.modUS4 



<400> 64 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420 
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480 
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540 
agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600 
atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720 
gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840 
agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900 
atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960 
cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020 
gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080 
atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140 
agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200 
tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260 
aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320 
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380 
attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440 
gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500 
tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560 
gtgcagcgcg agaagcgcgc cgtgggcctg ggcgccctgt tcatcggctt cctgggcgcc 1620 
gccgggagca ccatgggcgc cgcctccgtg accctgaccg tgcaggcccg ccagctgctg 1680 
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1740 
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcatcctggc cgtggagcgc 1800 
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1860 
accaccgtgc cctggaacag cagctggagc aacaagagcc tgaccgagat ctgggacaac 1920 
atgacctgga tggagtggga gcgcgagatc ggcaactaca ccggcctgat ctacaacctg 1980 
atcgagatcg cccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 2040 
tgggccagcc tgtggaactg gttcgacatc accaactggc tgtggtacat ccgcatcttc 2100 
atcatgatcg tgggcggcct gatcggcctg cgcatcgtgt tcgccgtgct gagcatcgtg 2160 
aaccgcgtgc gccagggcta cagccccatc agcctgcaga cccgcctgcc cgcccagcgc 2220 
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 2280 
aaccgcctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2340 
ttcagctacc accgcctgcg cgacctgctg ctgatcgtgg cccgcatcgt ggagctgctg 2400 
ggccgccgcg gctgggaggc cctgaagtac tggtggaacc tgctgcagta ctggagccag 2460 
gagctgaaga gcagcgccgt gagcctgttc aacgccaccg ccatcgccgt ggccgagggc 2520 
accgaccgca tcatcgagat cgtgcagcgc atcttccgcg ccgtgatcca catcccccgc 2580 
cgcatccgcc agggcctgga gcgcgccctg ctgtaagata tcggatcctc taga 2634 



<210> 65 
<211> 2538 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp160.modUS4.delV1 



<400> 65 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 

gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgaccct gaactgcacc gacaagctgg gcgccggcgg cgagatcaag 420 

aactgcagct tcaacatcac caccagcgtg cgcgacaagg tgcagaagga gtacagcctg 480 

ttctacaagc tggacgtggt gcccatcgac aacgacaacg ccagctaccg cctgatcaac 540 

tgcaacacca gcgtgatcac ccaggcctgc cccaaggtga gcttcgagcc catccccatc 600 

cactactgcg cccccgccgg cttcgccatc ctgaagtgca aggacaagaa gttcaacggc 660 

accggcccct gcaagaacgt gagcaccgtg cagtgcaccc acggcatccg ccccgtggtg 720 

agcacccagc tgctgctgaa cggcagcctg gccgaggagg agatcgtgct gcgctccgag 780 

aacttcaccg acaacgccaa gaccatcatc gtgcagctga acgagtccgt ggagatcaac 840 

tgcatccgcc ccaacaacaa cacgcgtaag agcatccaca tcggccccgg ccgcgccttc 900 

tacgccaccg gcgacatcat cggcgacatc cgccaggccc actgcaacat cagcaaggcc 960 

aactggacca acaccctcga gcagatcgtg gagaagctgc gcgagcagtt cggcaacaac 1020 

aagaccatca tcttcaacag cagcagcggc ggcgaccccg agatcgtgtt ccacagcttc 1080 

aactgcggcg gcgagttctt ctactgcaac accagccagc tgttcaacag cacctggaac 1140 

atcaccgagg aggtgaacaa gaccaaggag aacgacacca tcatcctgcc ctgccgcatc 1200 

cgccagatca tcaacatgtg gcaggaggtg ggcaaggcca tgtacgcccc ccccatccgc 1260 

ggccagatca agtgcagcag caatattacc ggcctgctgc tgacccgcga cggcggcacc 1320 

aacaacaacc gcaccaacga caccgagacc ttccgccccg gcggcggcaa catgaaggac 1380 

aactggcgca gcgagctgta caagtacaag gtggtgcgca tcgagcccct gggcgtggcc 1440 

cccacccagg ccaagcgccg cgtggtgcag cgcgagaagc gcgccgtggg cctgggcgcc 1500 

ctgttcatcg gcttcctggg cgccgccggg agcaccatgg gcgccgcctc cgtgaccctg 1560 
accgtgcagg cccgccagct gctgagcggc atcgtgcagc agcagaacaa cctgctgcgc 1620 
gccatcgagg cccagcagca cctgctgcag ctgaccgtgt ggggcatcaa gcagctgcag 1680 
gcccgcatcc tggccgtgga gcgctacctg aaggaccagc agctgctggg catctggggc 1740 
tgcagcggca agctgatctg caccaccacc gtgccctgga acagcagctg gagcaacaag 1800 
agcctgaccg agatctggga caacatgacc tggatggagt gggagcgcga gatcggcaac 1860 
tacaccggcc tgatctacaa cctgatcgag atcgcccaga accagcagga gaagaacgag 1920 
caggagctgc tggagctgga caagtgggcc agcctgtgga actggttcga catcaccaac 1980 
tggctgtggt acatccgcat cttcatcatg atcgtgggcg gcctgatcgg cctgcgcatc 2040 
gtgttcgccg tgctgagcat cgtgaaccgc gtgcgccagg gctacagccc catcagcctg 2100 
cagacccgcc tgcccgccca gcgcggcccc gaccgccccg agggcatcga ggaggagggc 2160 
ggcgagcgcg accgcgaccg cagcaaccgc ctggtgcacg gcctgctggc cctgatctgg 2220 
gacgacctgc gcagcctgtg cctgttcagc taccaccgcc tgcgcgacct gctgctgatc 2280 
gtggcccgca tcgtggagct gctgggccgc cgcggctggg aggccctgaa gtactggtgg 2340 
aacctgctgc agtactggag ccaggagctg aagagcagcg ccgtgagcct gttcaacgcc 2400 
accgccatcg ccgtggccga gggcaccgac cgcatcatcg agatcgtgca gcgcatcttc 2460 
cgcgccgtga tccacatccc ccgccgcatc cgccagggcc tggagcgcgc cctgctgtaa 2520 
gatatcggat cctctaga 2538 

<210> 66 
<211> 2553 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp160.modUS4.delV2 

<400> 66 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420 
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480 
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcggcgcc 540 
ggccgcctga tcaactgcaa caccagcgtg atcacccagg cctgccccaa ggtgagcttc 600 
gagcccatcc ccatccacta ctgcgccccc gccggcttcg ccatcctgaa gtgcaaggac 660 
aagaagttca acggcaccgg cccctgcaag aacgtgagca ccgtgcagtg cacccacggc 720 
atccgccccg tggtgagcac ccagctgctg ctgaacggca gcctggccga ggaggagatc 780 
gtgctgcgct ccgagaactt caccgacaac gccaagacca tcatcgtgca gctgaacgag 840 
tccgtggaga tcaactgcat ccgccccaac aacaacacgc gtaagagcat ccacatcggc 900 
cccggccgcg ccttctacgc caccggcgac atcatcggcg acatccgcca ggcccactgc 960 
aacatcagca aggccaactg gaccaacacc ctcgagcaga tcgtggagaa gctgcgcgag 1020 
cagttcggca acaacaagac catcatcttc aacagcagca gcggcggcga ccccgagatc 1080 
gtgttccaca gcttcaactg cggcggcgag ttcttctact gcaacaccag ccagctgttc 1140 
aacagcacct ggaacatcac cgaggaggtg aacaagacca aggagaacga caccatcatc 1200 
ctgccctgcc gcatccgcca gatcatcaac atgtggcagg aggtgggcaa ggccatgtac 1260 
gcccccccca tccgcggcca gatcaagtgc agcagcaata ttaccggcct gctgctgacc 1320 
cgcgacggcg gcaccaacaa caaccgcacc aacgacaccg agaccttccg ccccggcggc 1380 
ggcaacatga aggacaactg gcgcagcgag ctgtacaagt acaaggtggt gcgcatcgag 1440 
cccctgggcg tggcccccac ccaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 
gtgggcctgg gcgccctgtt catcggcttc ctgggcgccg ccgggagcac catgggcgcc 1560 
gcctccgtga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620 
aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 
atcaagcagc tgcaggcccg catcctggcc gtggagcgct acctgaagga ccagcagctg 1740 
ctgggcatct ggggctgcag cggcaagctg atctgcacca ccaccgtgcc ctggaacagc 1800 
agctggagca acaagagcct gaccgagatc tgggacaaca tgacctggat ggagtgggag 1860 
cgcgagatcg gcaactacac cggcctgatc tacaacctga tcgagatcgc ccagaaccag 1920 

caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 

ttcgacatca ccaactggct gtggtacatc cgcatcttca tcatgatcgt gggcggcctg 2040 

atcggcctgc gcatcgtgtt cgccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 

agccccatca gcctgcagac ccgcctgccc gcccagcgcg gccccgaccg ccccgagggc 2160 

atcgaggagg agggcggcga gcgcgaccgc gaccgcagca accgcctggt gcacggcctg 2220 

ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280 

gacctgctgc tgatcgtggc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340 

ctgaagtact ggtggaacct gctgcagtac tggagccagg agctgaagag cagcgccgtg 2400 

agcctgttca acgccaccgc catcgccgtg gccgagggca ccgaccgcat catcgagatc 2460 

gtgcagcgca tcttccgcgc cgtgatccac atcccccgcc gcatccgcca gggcctggag 2520 
cgcgccctgc tgtaagatat cggatcctct aga 2553 



<210> 67 
<211> 2340 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp160.modUS4.delV1/V2 



<400> 67 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360 
ggccaggcct gccccaaggt gagcttcgag cccatcccca tccactactg cgcccccgcc 420 
ggcttcgcca tcctgaagtg caaggacaag aagttcaacg gcaccggccc ctgcaagaac 480 
gtgagcaccg tgcagtgcac ccacggcatc cgccccgtgg tgagcaccca gctgctgctg 540 
aacggcagcc tggccgagga ggagatcgtg ctgcgctccg agaacttcac cgacaacgcc 600 
aagaccatca tcgtgcagct gaacgagtcc gtggagatca actgcatccg ccccaacaac 660 
aacacgcgta agagcatcca catcggcccc ggccgcgcct tctacgccac cggcgacatc 720 
atcggcgaca tccgccaggc ccactgcaac atcagcaagg ccaactggac caacaccctc 780 
gagcagatcg tggagaagct gcgcgagcag ttcggcaaca acaagaccat catcttcaac 840 
agcagcagcg gcggcgaccc cgagatcgtg ttccacagct tcaactgcgg cggcgagttc 900 
ttctactgca acaccagcca gctgttcaac agcacctgga acatcaccga ggaggtgaac 960 
aagaccaagg agaacgacac catcatcctg ccctgccgca tccgccagat catcaacatg 1020 
tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat caagtgcagc 1080 
agcaatatta ccggcctgct gctgacccgc gacggcggca ccaacaacaa ccgcaccaac 1140 
gacaccgaga ccttccgccc cggcggcggc aacatgaagg acaactggcg cagcgagctg 1200 
tacaagtaca aggtggtgcg catcgagccc ctgggcgtgg cccccaccca ggccaagcgc 1260 
cgcgtggtgc agcgcgagaa gcgcgccgtg ggcctgggcg ccctgttcat cggcttcctg 1320 
ggcgccgccg ggagcaccat gggcgccgcc tccgtgaccc tgaccgtgca ggcccgccag 1380 
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440 
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcat cctggccgtg 1500 
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560 
tgcaccacca ccgtgccctg gaacagcagc tggagcaaca agagcctgac cgagatctgg 1620 
gacaacatga cctggatgga gtgggagcgc gagatcggca actacaccgg cctgatctac 1680 
aacctgatcg agatcgccca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740 
gacaagtggg ccagcctgtg gaactggttc gacatcacca actggctgtg gtacatccgc 1800 
atcttcatca tgatcgtggg cggcctgatc ggcctgcgca tcgtgttcgc cgtgctgagc 1860 
atcgtgaacc gcgtgcgcca gggctacagc cccatcagcc tgcagacccg cctgcccgcc 1920 
cagcgcggcc ccgaccgccc cgagggcatc gaggaggagg gcggcgagcg cgaccgcgac 1980 
cgcagcaacc gcctggtgca cggcctgctg gccctgatct gggacgacct gcgcagcctg 2040 
tgcctgttca gctaccaccg cctgcgcgac ctgctgctga tcgtggcccg catcgtggag 2100 
ctgctgggcc gccgcggctg ggaggccctg aagtactggt ggaacctgct gcagtactgg 2160 
agccaggagc tgaagagcag cgccgtgagc ctgttcaacg ccaccgccat cgccgtggcc 2220 
gagggcaccg accgcatcat cgagatcgtg cagcgcatct tccgcgccgt gatccacatc 2280 
ccccgccgca tccgccaggg cctggagcgc gccctgctgt aagatatcgg atcctctaga 2340 

<210> 68 
<211> 2385 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp160.modUS4del 128-194 

<400> 68 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggggc agggaactgc gagaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 
aagtgcaagg acaagaagtt caacggcacc ggcccctgca agaacgtgag caccgtgcag 540 
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggaggaga tcgtgctgcg ctccgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaacg agtccgtgga gatcaactgc atccgcccca acaacaacac gcgtaagagc 720 
atccacatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag caaggccaac tggaccaaca ccctcgagca gatcgtggag 840 
aagctgcgcg agcagttcgg caacaacaag accatcatct tcaacagcag cagcggcggc 900 
gaccccgaga tcgtgttcca cagcttcaac tgcggcggcg agttcttcta ctgcaacacc 960 
agccagctgt tcaacagcac ctggaacatc accgaggagg tgaacaagac caaggagaac 1020 
gacaccatca tcctgccctg ccgcatccgc cagatcatca acatgtggca ggaggtgggc 1080 
aaggccatgt acgccccccc catccgcggc cagatcaagt gcagcagcaa tattaccggc 1140 
ctgctgctga cccgcgacgg cggcaccaac aacaaccgca ccaacgacac cgagaccttc 1200 
cgccccggcg gcggcaacat gaaggacaac tggcgcagcg agctgtacaa gtacaaggtg 1260 
gtgcgcatcg agcccctggg cgtggccccc acccaggcca agcgccgcgt ggtgcagcgc 1320 
gagaagcgcg ccgtgggcct gggcgccctg ttcatcggct tcctgggcgc cgccgggagc 1380 
accatgggcg ccgcctccgt gaccctgacc gtgcaggccc gccagctgct gagcggcatc 1440 
gtgcagcagc agaacaacct gctgcgcgcc atcgaggccc agcagcacct gctgcagctg 1500 
accgtgtggg gcatcaagca gctgcaggcc cgcatcctgg ccgtggagcg ctacctgaag 1560 
gaccagcagc tgctgggcat ctggggctgc agcggcaagc tgatctgcac caccaccgtg 1620 
ccctggaaca gcagctggag caacaagagc ctgaccgaga tctgggacaa catgacctgg 1680 
atggagtggg agcgcgagat cggcaactac accggcctga tctacaacct gatcgagatc 1740 
gcccagaacc agcaggagaa gaacgagcag gagctgctgg agctggacaa gtgggccagc 1800 
ctgtggaact ggttcgacat caccaactgg ctgtggtaca tccgcatctt catcatgatc 1860 
gtgggcggcc tgatcggcct gcgcatcgtg ttcgccgtgc tgagcatcgt gaaccgcgtg 1920 
cgccagggct acagccccat cagcctgcag acccgcctgc ccgcccagcg cggccccgac 1980 
cgccccgagg gcatcgagga ggagggcggc gagcgcgacc gcgaccgcag caaccgcctg 2040 
gtgcacggcc tgctggccct gatctgggac gacctgcgca gcctgtgcct gttcagctac 2100 
caccgcctgc gcgacctgct gctgatcgtg gcccgcatcg tggagctgct gggccgccgc 2160 
ggctgggagg ccctgaagta ctggtggaac ctgctgcagt actggagcca ggagctgaag 2220 
agcagcgccg tgagcctgtt caacgccacc gccatcgccg tggccgaggg caccgaccgc 2280 
atcatcgaga tcgtgcagcg catcttccgc gccgtgatcc acatcccccg ccgcatccgc 2340 
cagggcctgg agcgcgccct gctgtaagat atcggatcct ctaga 2385 

<210> 69 

<211> 144 

<212> DNA 

<213> Human immunodeficiency virus 
<400> 69 

gacaccatca tcctgccctg ccgcatccgc cagatcatca acatgtggca ggaggtgggc 60 

aaggccatgt acgccccccc catccgcggc cagatcaagt gcagcagcaa catcaccggc 120 

ctgctgctga cccgcgacgg cggc 144 

<210> 70 
<211> 144 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 70 

ggaactatca cactcccatg cagaataaaa caaattataa acaggtggca ggaagtagga 60 
aaagcaatgt atgcccctcc catcagagga caaattagat gctcatcaaa tattacagga 120 
ctgctattaa caagagatgg tggt 144 

<210> 71 
<211> 144 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic Env 
US4 common region 

<400> 71 

gacaccatca tcctgccctg ccgcatccgc cagatcatca acatgtggca ggaggtgggc 60 

aaggccatgt acgccccccc catccgcggc cagatcaagt gcagcagcaa catcaccggc 120 

ctgctgctga cccgcgacgg cggc 144 

<210> 72 
<211> 144 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic Env 
SF162 common region 

<400> 72 

ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 60 

aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 120 
ctgctgctga cccgcgacgg cggc 144 

<210> 73 
<211> 4766 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp160.modUS4.gag.modSF2 

<400> 73 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180 

gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240

gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gaactgcacc gacaagctga ccggcagcac caacggcacc 420
aacagcacca gcggcaccaa cagcaccagc ggcaccaaca gcaccagcac caacagcacc 480
gacagctggg agaagatgcc cgagggcgag atcaagaact gcagcttcaa catcaccacc 540
agcgtgcgcg acaaggtgca gaaggagtac agcctgttct acaagctgga cgtggtgccc 600
atcgacaacg acaacgccag ctaccgcctg atcaactgca acaccagcgt gatcacccag 660
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 720
gccatcctga agtgcaagga caagaagttc aacggcaccg gcccctgcaa gaacgtgagc 780
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 840
agcctggccg aggaggagat cgtgctgcgc tccgagaact tcaccgacaa cgccaagacc 900
atcatcgtgc agctgaacga gtccgtggag atcaactgca tccgccccaa caacaacacg 960
cgtaagagca tccacatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 1020
gacatccgcc aggcccactg caacatcagc aaggccaact ggaccaacac cctcgagcag 1080
atcgtggaga agctgcgcga gcagttcggc aacaacaaga ccatcatctt caacagcagc 1140
agcggcggcg accccgagat cgtgttccac agcttcaact gcggcggcga gttcttctac 1200
tgcaacacca gccagctgtt caacagcacc tggaacatca ccgaggaggt gaacaagacc 1260
aaggagaacg acaccatcat cctgccctgc cgcatccgcc agatcatcaa catgtggcag 1320
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatcaagtg cagcagcaat 1380
attaccggcc tgctgctgac ccgcgacggc ggcaccaaca acaaccgcac caacgacacc 1440
gagaccttcc gccccggcgg cggcaacatg aaggacaact ggcgcagcga gctgtacaag 1500
tacaaggtgg tgcgcatcga gcccctgggc gtggccccca cccaggccaa gcgccgcgtg 1560
gtgcagcgcg agaagcgcgc cgtgggcctg ggcgccctgt tcatcggctt cctgggcgcc 1620
gccgggagca ccatgggcgc cgcctccgtg accctgaccg tgcaggcccg ccagctgctg 1680
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1740
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcatcctggc cgtggagcgc 1800
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1860
accaccgtgc cctggaacag cagctggagc aacaagagcc tgaccgagat ctgggacaac 1920
atgacctgga tggagtggga gcgcgagatc ggcaactaca ccggcctgat ctacaacctg 1980
atcgagatcg cccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 2040
tgggccagcc tgtggaactg gttcgacatc accaactggc tgtggtacat ccgcatcttc 2100
atcatgatcg tgggcggcct gatcggcctg cgcatcgtgt tcgccgtgct gagcatcgtg 2160
aaccgcgtgc gccagggcta cagccccatc agcctgcaga cccgcctgcc cgcccagcgc 2220
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 2280
aaccgcctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2340
ttcagctacc accgcctgcg cgacctgctg ctgatcgtgg cccgcatcgt ggagctgctg 2400
ggccgccgcg gctgggaggc cctgaagtac tggtggaacc tgctgcagta ctggagccag 2460
gagctgaaga gcagcgccgt gagcctgttc aacgccaccg ccatcgccgt ggccgagggc 2520
accgaccgca tcatcgagat cgtgcagcgc atcttccgcg ccgtgatcca catcccccgc 2580
cgcatccgcc agggcctgga gcgcgccctg ctgtaagata tcggatcctc tagagaattc 2640
cgcccccccc cccccccccc ctctccctcc ccccccccta acgttactgg ccgaagccgc 2700
ttggaataag gccggtgtgc gtttgtctat atgttatttt ccaccatatt gccgtctttt 2760
ggcaatgtga gggcccggaa acctggccct gtcttcttga cgagcattcc taggggtctt 2820
tcccctctcg ccaaaggaat gcaaggtctg ttgaatgtcg tgaaggaagc agttcctctg 2880
gaagcttctt gaagacaaac aacgtctgta gcgacccttt gcaggcagcg gaacccccca 2940
cctggcgaca ggtgcctctg cggccaaaag ccacgtgtat aagatacacc tgcaaaggcg 3000
gcacaacccc agtgccacgt tgtgagttgg atagttgtgg aaagagtcaa atggctctcc 3060
tcaagcgtat tcaacaaggg gctgaaggat gcccagaagg taccccattg tatgggatct 3120
gatctggggc ctcggtgcac atgctttaca tgtgtttagt cgaggttaaa aaaacgtcta 3180
ggccccccga accacgggga cgtggttttc ctttgaaaaa cacgataata ccatgggcgc 3240
ccgcgccagc gtgctgagcg gcggcgagct ggacaagtgg gagaagatcc gcctgcgccc 3300
cggcggcaag aagaagtaca agctgaagca catcgtgtgg gccagccgcg agctggagcg 3360
cttcgccgtg aaccccggcc tgctggagac cagcgagggc tgccgccaga tcctgggcca 3420
gctgcagccc agcctgcaga ccggcagcga ggagctgcgc agcctgtaca acaccgtggc 3480
caccctgtac tgcgtgcacc agcgcatcga cgtcaaggac accaaggagg ccctggagaa 3540
gatcgaggag gagcagaaca agtccaagaa gaaggcccag caggccgccg ccgccgccgg 3600
caccggcaac agcagccagg tgagccagaa ctaccccatc gtgcagaacc tgcagggcca 3660
gatggtgcac caggccatca gcccccgcac cctgaacgcc tgggtgaagg tggtggagga 3720
gaaggccttc agccccgagg tgatccccat gttcagcgcc ctgagcgagg gcgccacccc 3780
ccaggacctg aacacgatgt tgaacaccgt gggcggccac caggccgcca tgcagatgct 3840
gaaggagacc atcaacgagg aggccgccga gtgggaccgc gtgcaccccg tgcacgccgg 3900
ccccatcgcc cccggccaga tgcgcgagcc ccgcggcagc gacatcgccg gcaccaccag 3960

caccctgcag gagcagatcg gctggatgac caacaacccc cccatccccg tgggcgagat 4020 

ctacaagcgg tggatcatcc tgggcctgaa caagatcgtg cggatgtaca gccccaccag 4080

catcctggac atccgccagg gccccaagga gcccttccgc gactacgtgg accgcttcta 4140 

caagaccctg cgcgctgagc aggccagcca ggacgtgaag aactggatga ccgagaccct 4200 

gctggtgcag aacgccaacc ccgactgcaa gaccatcctg aaggctctcg gccccgcggc 4260 

caccctggag gagatgatga ccgcctgcca gggcgtgggc ggccccggcc acaaggcccg 4320 

cgtgctggcc gaggcgatga gccaggtgac gaacccggcg accatcatga tgcagcgcgg 4380 

caacttccgc aaccagcgga agaccgtcaa gtgcttcaac tgcggcaagg agggccacac 4440

cgccaggaac tgccgcgccc cccgcaagaa gggctgctgg cgctgcggcc gcgagggcca 4500

ccagatgaag gactgcaccg agcgccaggc caacttcctg ggcaagatct ggcccagcta 4560 

caagggccgc cccggcaact tcctgcagag ccgccccgag cccaccgccc cccccgagga 4620 

gagcttccgc ttcggcgagg agaagaccac ccccagccag aagcaggagc ccatcgacaa 4680 

ggagctgtac cccctgacca gcctgcgcag cctgttcggc aacgacccca gcagccagta 4740 

agaattcaga ctcgagcaag tctaga 4766 

<210> 74 
<211> 4689 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
gp160.modSF162.gag.modSF2



<400> 74 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200
ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctggca ggaggtgggc 1260
aaggccatgt acgccccccc catccgcggc cagatccgct gcagcagcaa catcaccggc 1320
ctgctgctga cccgcgacgg cggcaaggag atcagcaaca ccaccgagat cttccgcccc 1380
ggcggcggcg acatgcgcga caactggcgc agcgagctgt acaagtacaa ggtggtgaag 1440
atcgagcccc tgggcgtggc ccccaccaag gccaagcgcc gcgtggtgca gcgcgagaag 1500
cgcgccgtga ccctgggcgc catgttcctg ggcttcctgg gcgccgccgg cagcaccatg 1560
ggcgcccgca gcctgaccct gaccgtgcag gcccgccagc tgctgagcgg catcgtgcag 1620
cagcagaaca acctgctgcg cgccatcgag gcccagcagc acctgctgca gctgaccgtg 1680
tggggcatca agcagctgca ggcccgcgtg ctggccgtgg agcgctacct gaaggaccag 1740
cagctgctgg gcatctgggg ctgcagcggc aagctgatct gcaccaccgc cgtgccctgg 1800
aacgccagct ggagcaacaa gagcctggac cagatctgga acaacatgac ctggatggag 1860
tgggagcgcg agatcgacaa ctacaccaac ctgatctaca ccctgatcga ggagagccag 1920
aaccagcagg agaagaacga gcaggagctg ctggagctgg acaagtgggc cagcctgtgg 1980
aactggttcg acatcagcaa gtggctgtgg tacatcaaga tcttcatcat gatcgtgggc 2040
ggcctggtgg gcctgcgcat cgtgttcacc gtgctgagca tcgtgaaccg cgtgcgccag 2100 
ggctacagcc ccctgagctt ccagacccgc ttccccgccc cccgcggccc cgaccgcccc 2160 
gagggcatcg aggaggaggg cggcgagcgc gaccgcgacc gcagcagccc cctggtgcac 2220 
ggcctgctgg ccctgatctg ggacgacctg cgcagcctgt gcctgttcag ctaccaccgc 2280 
ctgcgcgacc tgatcctgat cgccgcccgc atcgtggagc tgctgggccg ccgcggctgg 2340 
gaggccctga agtactgggg caacctgctg cagtactgga tccaggagct gaagaacagc 2400
gccgtgagcc tgttcgacgc catcgccatc gccgtggccg agggcaccga ccgcatcatc 2460 
gaggtggccc agcgcatcgg ccgcgccttc ctgcacatcc cccgccgcat ccgccagggc 2520 
ttcgagcgcg ccctgctgta actcgagcaa gtctagagaa ttccgccccc cccccccccc 2580 
cccctctccc tccccccccc ctaacgttac tggccgaagc cgcttggaat aaggccggtg 2640 
tgcgtttgtc tatatgttat tttccaccat attgccgtct tttggcaatg tgagggcccg 2700 
gaaacctggc cctgtcttct tgacgagcat tcctaggggt ctttcccctc tcgccaaagg 2760 
aatgcaaggt ctgttgaatg tcgtgaagga agcagttcct ctggaagctt cttgaagaca 2820 
aacaacgtct gtagcgaccc tttgcaggca gcggaacccc ccacctggcg acaggtgcct 2880 
ctgcggccaa aagccacgtg tataagatac acctgcaaag gcggcacaac cccagtgcca 2940 
cgttgtgagt tggatagttg tggaaagagt caaatggctc tcctcaagcg tattcaacaa 3000 
ggggctgaag gatgcccaga aggtacccca ttgtatggga tctgatctgg ggcctcggtg 3060 
cacatgcttt acatgtgttt agtcgaggtt aaaaaaacgt ctaggccccc cgaaccacgg 3120 
ggacgtggtt ttcctttgaa aaacacgata ataccatggg cgcccgcgcc agcgtgctga 3180 
gcggcggcga gctggacaag tgggagaaga tccgcctgcg ccccggcggc aagaagaagt 3240
acaagctgaa gcacatcgtg tgggccagcc gcgagctgga gcgcttcgcc gtgaaccccg 3300 
gcctgctgga gaccagcgag ggctgccgcc agatcctggg ccagctgcag cccagcctgc 3360 
agaccggcag cgaggagctg cgcagcctgt acaacaccgt ggccaccctg tactgcgtgc 3420
accagcgcat cgacgtcaag gacaccaagg aggccctgga gaagatcgag gaggagcaga 3480 
acaagtccaa gaagaaggcc cagcaggccg ccgccgccgc cggcaccggc aacagcagcc 3540 
aggtgagcca gaactacccc atcgtgcaga acctgcaggg ccagatggtg caccaggcca 3600 
tcagcccccg caccctgaac gcctgggtga aggtggtgga ggagaaggcc ttcagccccg 3660 
aggtgatccc catgttcagc gccctgagcg agggcgccac cccccaggac ctgaacacga 3720 
tgttgaacac cgtgggcggc caccaggccg ccatgcagat gctgaaggag accatcaacg 3780 
aggaggccgc cgagtgggac cgcgtgcacc ccgtgcacgc cggccccatc gcccccggcc 3840
agatgcgcga gccccgcggc agcgacatcg ccggcaccac cagcaccctg caggagcaga 3900 
tcggctggat gaccaacaac ccccccatcc ccgtgggcga gatctacaag cggtggatca 3960 
tcctgggcct gaacaagatc gtgcggatgt acagccccac cagcatcctg gacatccgcc 4020 
agggccccaa ggagcccttc cgcgactacg tggaccgctt ctacaagacc ctgcgcgctg 4080 
agcaggccag ccaggacgtg aagaactgga tgaccgagac cctgctggtg cagaacgcca 4140 
accccgactg caagaccatc ctgaaggctc tcggccccgc ggccaccctg gaggagatga 4200 
tgaccgcctg ccagggcgtg ggcggccccg gccacaaggc ccgcgtgctg gccgaggcga 4260 
tgagccaggt gacgaacccg gcgaccatca tgatgcagcg cggcaacttc cgcaaccagc 4320 
ggaagaccgt caagtgcttc aactgcggca aggagggcca caccgccagg aactgccgcg 4380 
ccccccgcaa gaagggctgc tggcgctgcg gccgcgaggg ccaccagatg aaggactgca 4440 
ccgagcgcca ggccaacttc ctgggcaaga tctggcccag ctacaagggc cgccccggca 4500 
acttcctgca gagccgcccc gagcccaccg ccccccccga ggagagcttc cgcttcggcg 4560 
aggagaagac cacccccagc cagaagcagg agcccatcga caaggagctg taccccctga 4620 
ccagcctgcg cagcctgttc ggcaacgacc ccagcagcca gtaagaattc agactcgagc 4680 
aagtctaga 4689 

<210> 75 
<211> 4472 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
gp160.modUS4.delV1/V2.gag.modSF2

<400> 75 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccaccacc gtgctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcttacaag 180
gccgaggccc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gaggtgaacc tgaccaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcatg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360
ggccaggcct gccccaaggt gagcttcgag cccatcccca tccactactg cgcccccgcc 420
ggcttcgcca tcctgaagtg caaggacaag aagttcaacg gcaccggccc ctgcaagaac 480
gtgagcaccg tgcagtgcac ccacggcatc cgccccgtgg tgagcaccca gctgctgctg 540
aacggcagcc tggccgagga ggagatcgtg ctgcgctccg agaacttcac cgacaacgcc 600
aagaccatca tcgtgcagct gaacgagtcc gtggagatca actgcatccg ccccaacaac 660
aacacgcgta agagcatcca catcggcccc ggccgcgcct tctacgccac cggcgacatc 720
atcggcgaca tccgccaggc ccactgcaac atcagcaagg ccaactggac caacaccctc 780
gagcagatcg tggagaagct gcgcgagcag ttcggcaaca acaagaccat catcttcaac 840
agcagcagcg gcggcgaccc cgagatcgtg ttccacagct tcaactgcgg cggcgagttc 900
ttctactgca acaccagcca gctgttcaac agcacctgga acatcaccga ggaggtgaac 960
aagaccaagg agaacgacac catcatcctg ccctgccgca tccgccagat catcaacatg 1020
tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat caagtgcagc 1080
agcaatatta ccggcctgct gctgacccgc gacggcggca ccaacaacaa ccgcaccaac 1140
gacaccgaga ccttccgccc cggcggcggc aacatgaagg acaactggcg cagcgagctg 1200
tacaagtaca aggtggtgcg catcgagccc ctgggcgtgg cccccaccca ggccaagcgc 1260
cgcgtggtgc agcgcgagaa gcgcgccgtg ggcctgggcg ccctgttcat cggcttcctg 1320
ggcgccgccg ggagcaccat gggcgccgcc tccgtgaccc tgaccgtgca ggcccgccag 1380
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcat cctggccgtg 1500
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560
tgcaccacca ccgtgccctg gaacagcagc tggagcaaca agagcctgac cgagatctgg 1620
gacaacatga cctggatgga gtgggagcgc gagatcggca actacaccgg cctgatctac 1680
aacctgatcg agatcgccca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740
gacaagtggg ccagcctgtg gaactggttc gacatcacca actggctgtg gtacatccgc 1800
atcttcatca tgatcgtggg cggcctgatc ggcctgcgca tcgtgttcgc cgtgctgagc 1860
atcgtgaacc gcgtgcgcca gggctacagc cccatcagcc tgcagacccg cctgcccgcc 1920
cagcgcggcc ccgaccgccc cgagggcatc gaggaggagg gcggcgagcg cgaccgcgac 1980
cgcagcaacc gcctggtgca cggcctgctg gccctgatct gggacgacct gcgcagcctg 2040
tgcctgttca gctaccaccg cctgcgcgac ctgctgctga tcgtggcccg catcgtggag 2100
ctgctgggcc gccgcggctg ggaggccctg aagtactggt ggaacctgct gcagtactgg 2160
agccaggagc tgaagagcag cgccgtgagc ctgttcaacg ccaccgccat cgccgtggcc 2220
gagggcaccg accgcatcat cgagatcgtg cagcgcatct tccgcgccgt gatccacatc 2280
ccccgccgca tccgccaggg cctggagcgc gccctgctgt aagatatcgg atcctctaga 2340
gaattccgcc cccccccccc ccccccctct ccctcccccc cccctaacgt tactggccga 2400
agccgcttgg aataaggccg gtgtgcgttt gtctatatgt tattttccac catattgccg 2460
tcttttggca atgtgagggc ccggaaacct ggccctgtct tcttgacgag cattcctagg 2520
ggtctttccc ctctcgccaa aggaatgcaa ggtctgttga atgtcgtgaa ggaagcagtt 2580
cctctggaag cttcttgaag acaaacaacg tctgtagcga ccctttgcag gcagcggaac 2640
cccccacctg gcgacaggtg cctctgcggc caaaagccac gtgtataaga tacacctgca 2700
aaggcggcac aaccccagtg ccacgttgtg agttggatag ttgtggaaag agtcaaatgg 2760
ctctcctcaa gcgtattcaa caaggggctg aaggatgccc agaaggtacc ccattgtatg 2820
ggatctgatc tggggcctcg gtgcacatgc tttacatgtg tttagtcgag gttaaaaaaa 2880
cgtctaggcc ccccgaacca cggggacgtg gttttccttt gaaaaacacg ataataccat 2940
gggcgcccgc gccagcgtgc tgagcggcgg cgagctggac aagtgggaga agatccgcct 3000
gcgccccggc ggcaagaaga agtacaagct gaagcacatc gtgtgggcca gccgcgagct 3060
ggagcgcttc gccgtgaacc ccggcctgct ggagaccagc gagggctgcc gccagatcct 3120
gggccagctg cagcccagcc tgcagaccgg cagcgaggag ctgcgcagcc tgtacaacac 3180
cgtggccacc ctgtactgcg tgcaccagcg catcgacgtc aaggacacca aggaggccct 3240
ggagaagatc gaggaggagc agaacaagtc caagaagaag gcccagcagg ccgccgccgc 3300
cgccggcacc ggcaacagca gccaggtgag ccagaactac cccatcgtgc agaacctgca 3360
gggccagatg gtgcaccagg ccatcagccc ccgcaccctg aacgcctggg tgaaggtggt 3420
ggaggagaag gccttcagcc ccgaggtgat ccccatgttc agcgccctga gcgagggcgc 3480
caccccccag gacctgaaca cgatgttgaa caccgtgggc ggccaccagg ccgccatgca 3540
gatgctgaag gagaccatca acgaggaggc cgccgagtgg gaccgcgtgc accccgtgca 3600
cgccggcccc atcgcccccg gccagatgcg cgagccccgc ggcagcgaca tcgccggcac 3660
tgcccgtgaa gctgaagccg gggatggacg gccccaaggt caagcagtgg cccctgtaag 1860
aattc 1865 

<210> 79 
<211> 1865 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: GP2 
<400> 79 

gtcgacgcca ccatgggcgc ccgcgccagc gtgctgagcg gcggcgagct ggacaagtgg 60 
gagaagatcc gcctgcgccc cggcggcaag aagaagtaca agctgaagca catcgtgtgg 120
gccagccgcg agctggagcg cttcgccgtg aaccccggcc tgctggagac cagcgagggc 180 
tgccgccaga tcctgggcca gctgcagccc agcctgcaga ccggcagcga ggagctgcgc 240
agcctgtaca acaccgtggc caccctgtac tgcgtgcacc agcgcatcga cgtcaaggac 300 
accaaggagg ccctggagaa gatcgaggag gagcagaaca agtccaagaa gaaggcccag 360 
caggccgccg ccgccgccgg caccggcaac agcagccagg tgagccagaa ctaccccatc 420 
gtgcagaacc tgcagggcca gatggtgcac caggccatca gcccccgcac cctgaacgcc 480 
tgggtgaagg tggtggagga gaaggccttc agccccgagg tgatccccat gttcagcgcc 540
ctgagcgagg gcgccacccc ccaggacctg aacacgatgt tgaacaccgt gggcggccac 600 
caggccgcca tgcagatgct gaaggagacc atcaacgagg aggccgccga gtgggaccgc 660 
gtgcaccccg tgcacgccgg ccccatcgcc cccggccaga tgcgcgagcc ccgcggcagc 720
gacatcgccg gcaccaccag caccctgcag gagcagatcg gctggatgac caacaacccc 780 
cccatccccg tgggcgagat ctacaagcgg tggatcatcc tgggcctgaa caagatcgtg 840 
cggatgtaca gccccaccag catcctggac atccgccagg gccccaagga gcccttccgc 900 
gactacgtgg accgcttcta caagaccctg cgcgctgagc aggccagcca ggacgtgaag 960
aactggatga ccgagaccct gctggtgcag aacgccaacc ccgactgcaa gaccatcctg 1020
aaggctctcg gccccgcggc caccctggag gagatgatga ccgcctgcca gggcgtgggc 1080 
ggccccggcc acaaggcccg cgtgctggcc gaggcgatga gccaggtgac gaacccggcg 1140
accatcatga tgcagcgcgg caacttccgc aaccagcgga agaccgtcaa gtgcttcaac 1200 
tgcggcaagg agggccacac cgccaggaac tgccgcgccc cccgcaagaa gggctgctgg 1260 
cgctgcggcc gcgaaggaca ccaaatgaaa gattgcactg agagacaggc taatttttta 1320
gggaagatct ggccttccta caagggaagg ccagggaatt ttcttcagag cagaccagag 1380 
ccaacagccc caccagaaga gagcttcagg tttggggagg agaaaacaac tccctctcag 1440
aagcaggagc cgatagacaa ggaactgtat cctttaactt ccctcagatc actctttggc 1500 
aacgacccct cgtcacagta aggatcgggg ggcaactcaa ggaagcgctg ctcgatacag 1560 
gagcagatga tacagtatta gaagaaatga atttgccagg aaaatggaaa ccaaaaatga 1620 
taggggggat cgggggcttc atcaaggtga ggcagtacga ccagatacct gtagaaatct 1680 
gtggacataa agctataggt acagtattag taggacctac acctgtcaac ataattggaa 1740 
gaaatctgtt gacccagatc ggctgcacct tgaacttccc catcagccct attgagacgg 1800 
tgcccgtgaa gttgaagccg gggatggacg gccccaaggt caagcaatgg ccattgtaag 1860 
aattc 1865 

<210> 80 
<211> 2305 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
FS(+).proinact.RTopt.YM



<400> 80 

gcggccgcga aggacaccaa atgaaagatt gcactgagag acaggctaat tttttaggga 60
agatctggcc ttcctacaag ggaaggccag ggaattttct tcagagcaga ccagagccaa 120
cagccccacc agaagagagc ttcaggtttg gggaggagaa aacaactccc tctcagaagc 180
aggagccgat agacaaggaa ctgtatcctt taacttccct cagatcactc tttggcaacg 240
ggtgcagctg ggcatccccc accccgccgg cctgaagaag aagaagagcg tgaccgtgct 840
ggacgtgggc gacgcctact tcagcgtgcc cctggacaag gacttccgca agtacaccgc 900 
cttcaccatc cccagcatca acaacgagac ccccggcatc cgctaccagt acaacgtgct 960 
gccccagggc tggaagggca gccccgccat cttccagagc agcatgacca agatcctgga 1020 
gcccttccgc aagcagaacc ccgacatcgt gatctaccag gcccccctgt acgtgggcag 1080 
cgacctggag atcggccagc accgcaccaa gatcgaggag ctgcgccagc acctgctgcg 1140
ctggggcttc accacccccg acaagaagca ccagaaggag ccccccttcc tgcccatcga 1200
gctgcacccc gacaagtgga ccgtgcagcc catcatgctg cccgagaagg acagctggac 1260 
cgtgaacgac atccagaagc tggtgggcaa gctgaactgg gccagccaga tctacgccgg 1320 
catcaaggtg aagcagctgt gcaagctgct gcgcggcacc aaggccctga ccgaggtgat 1380 
ccccctgacc gaggaggccg agctggagct ggccgagaac cgcgagatcc tgaaggagcc 1440 
cgtgcacgag gtgtactacg accccagcaa ggacctggtg gccgagatcc agaagcaggg 1500 
ccagggccag tggacctacc agatctacca ggagcccttc aagaacctga agaccggcaa 1560 
gtacgcccgc atgcgcggcg cccacaccaa cgacgtgaag cagctgaccg aggccgtgca 1620 
gaaggtgagc accgagagca tcgtgatctg gggcaagatc cccaagttca agctgcccat 1680 
ccagaaggag acctgggagg cctggtggat ggagtactgg caggccacct ggatccccga 1740
gtgggagttc gtgaacaccc cccccctggt gaagctgtgg taccagctgg agaaggagcc 1800 
catcgtgggc gccgagacct tctacgtgga cggcgccgcc aaccgcgaga ccaagctggg 1860 
caaggccggc tacgtgaccg accggggccg gcagaaggtg gtgagcatcg ccgacaccac 1920 
caaccagaag accgagctgc aggccatcca cctggccctg caggacagcg gcctggaggt 1980 
gaacatcgtg accgacagcc agtacgccct gggcatcatc caggcccagc ccgacaagag 2040
cgagagcgag ctggtgagcc agatcatcga gcagctgatc aagaaggaga aggtgtacct 2100 
ggcctgggtg cccgcccaca agggcatcgg cggcaacgag caggtggaca agctggtgag 2160 
cgccggcatc cgcaaggtgc tgttcctgaa cggcatcgat ggcggcatcg tgatctacca 2220 
gtacatggac gacctgtacg tgggcagcgg cggccctagg atcgattaaa agcttcccgg 2280
ggctagcacc ggtgaattc 2299 

<210> 82 
<211> 2306 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
FS(-).protmod.RTopt.YM

<400> 82 

gcggccgcga aggacaccaa atgaaagatt gcactgagag acaggctaat ttcttccgcg 60 
aggacctggc cttcctgcag ggcaaggccc gcgagttcag cagcgagcag acccgcgcca 120 
acagccccac ccgccgcgag ctgcaggtgt ggggcggcga gaacaacagc ctgagcgagg 180 
ccggcgccga ccgccagggc accgtgagct tcaacttccc ccagatcacc ctgtggcagc 240
gccccctggt gaccatcagg atcggcggcc agctcaagga ggcgctgctc gacaccggcg 300
ccgacgacac cgtgctggag gagatgaacc tgcccggcaa gtggaagccc aagatgatcg 360
gcgggatcgg gggcttcatc aaggtgcggc agtacgacca gatccccgtg gagatctgcg 420 
gccacaaggc catcggcacc gtgctggtgg gccccacccc cgtgaacatc atcggccgca 480
acctgctgac ccagatcggc tgcaccctga acttccccat cagccccatc gagacggtgc 540 
ccgtgaagct gaagccgggg atggacggcc ccaaggtcaa gcagtggccc ctgaccgagg 600
agaagatcaa ggccctggtg gagatctgca ccgagatgga gaaggagggc aagatcagca 660 
agatcggccc cgagaacccc tacaacaccc ccgtgttcgc catcaagaag aaggacagca 720 
ccaagtggcg caagctggtg gacttccgcg agctgaacaa gcgcacccag gacttctggg 780 
aggtgcagct gggcatcccc caccccgccg gcctgaagaa gaagaagagc gtgaccgtgc 840 
tggacgtggg cgacgcctac ttcagcgtgc ccctggacaa ggacttccgc aagtacaccg 900 
ccttcaccat ccccagcatc aacaacgaga cccccggcat ccgctaccag tacaacgtgc 960 
tgccccaggg ctggaagggc agccccgcca tcttccagag cagcatgacc aagatcctgg 1020 
agcccttccg caagcagaac cccgacatcg tgatctacca ggcccccctg tacgtgggca 1080 
gcgacctgga gatcggccag caccgcacca agatcgagga gctgcgccag cacctgctgc 1140 
gctggggctt caccaccccc gacaagaagc accagaagga gccccccttc ctgtggatgg 1200 
gctacgagct gcaccccgac aagtggaccg tgcagcccat catgctgccc gagaaggaca 1260 
gctggaccgt gaacgacatc cagaagctgg tgggcaagct gaactgggcc agccagatct 1320 
acgccggcat caaggtgaag cagctgtgca agctgctgcg cggcaccaag gccctgaccg 1380
aggtgatccc cctgaccgag gaggccgagc tggagctggc cgagaaccgc gagatcctga 1440
aggagcccgt gcacgaggtg tactacgacc ccagcaagga cctggtggcc gagatccaga 1500 
agcagggcca gggccagtgg acctaccaga tctaccagga gcccttcaag aacctgaaga 1560 
ccggcaagta cgcccgcatg cgcggcgccc acaccaacga cgtgaagcag ctgaccgagg 1620 
ccgtgcagaa ggtgagcacc gagagcatcg tgatctgggg caagatcccc aagttcaagc 1680 
tgcccatcca gaaggagacc tgggaggcct ggtggatgga gtactggcag gccacctgga 1740
tccccgagtg ggagttcgtg aacacccccc ccctggtgaa gctgtggtac cagctggaga 1800 
aggagcccat cgtgggcgcc gagaccttct acgtggacgg cgccgccaac cgcgagacca 1860 
agctgggcaa ggccggctac gtgaccgacc ggggccggca gaaggtggtg agcatcgccg 1920
acaccaccaa ccagaagacc gagctgcagg ccatccacct ggccctgcag gacagcggcc 1980 
tggaggtgaa catcgtgacc gacagccagt acgccctggg catcatccag gcccagcccg 2040
acaagagcga gagcgagctg gtgagccaga tcatcgagca gctgatcaag aaggagaagg 2100 
tgtacctggc ctgggtgccc gcccacaagg gcatcggcgg caacgagcag gtggacaagc 2160 
tggtgagcgc cggcatccgc aaggtgctgt tcctgaacgg catcgatggc ggcatcgtga 2220
tctaccagta catggacgac ctgtacgtgg gcagcggcgg ccctaggatc gattaaaagc 2280 
ttcccggggc tagcaccggt gaattc 2306 

<210> 83 
<211> 2300 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
FS(-).protmod.RTopt.YMWM

<400> 83 

gcggccgcga aggacaccaa atgaaagatt gcactgagag acaggctaat ttcttccgcg 60 
aggacctggc cttcctgcag ggcaaggccc gcgagttcag cagcgagcag acccgcgcca 120
acagccccac ccgccgcgag ctgcaggtgt ggggcggcga gaacaacagc ctgagcgagg 180 
ccggcgccga ccgccagggc accgtgagct tcaacttccc ccagatcacc ctgtggcagc 240
gccccctggt gaccatcagg atcggcggcc agctcaagga ggcgctgctc gacaccggcg 300 
ccgacgacac cgtgctggag gagatgaacc tgcccggcaa gtggaagccc aagatgatcg 360 
gcgggatcgg gggcttcatc aaggtgcggc agtacgacca gatccccgtg gagatctgcg 420 
gccacaaggc catcggcacc gtgctggtgg gccccacccc cgtgaacatc atcggccgca 480
acctgctgac ccagatcggc tgcaccctga acttccccat cagccccatc gagacggtgc 540
ccgtgaagct gaagccgggg atggacggcc ccaaggtcaa gcagtggccc ctgaccgagg 600 
agaagatcaa ggccctggtg gagatctgca ccgagatgga gaaggagggc aagatcagca 660 
agatcggccc cgagaacccc tacaacaccc ccgtgttcgc catcaagaag aaggacagca 720
ccaagtggcg caagctggtg gacttccgcg agctgaacaa gcgcacccag gacttctggg 780 
aggtgcagct gggcatcccc caccccgccg gcctgaagaa gaagaagagc gtgaccgtgc 840
tggacgtggg cgacgcctac ttcagcgtgc ccctggacaa ggacttccgc aagtacaccg 900 
ccttcaccat ccccagcatc aacaacgaga cccccggcat ccgctaccag tacaacgtgc 960 
tgccccaggg ctggaagggc agccccgcca tcttccagag cagcatgacc aagatcctgg 1020 
agcccttccg caagcagaac cccgacatcg tgatctacca ggcccccctg tacgtgggca 1080 
gcgacctgga gatcggccag caccgcacca agatcgagga gctgcgccag cacctgctgc 1140
gctggggctt caccaccccc gacaagaagc accagaagga gccccccttc ctgcccatcg 1200
agctgcaccc cgacaagtgg accgtgcagc ccatcatgct gcccgagaag gacagctgga 1260 
ccgtgaacga catccagaag ctggtgggca agctgaactg ggccagccag atctacgccg 1320 
gcatcaaggt gaagcagctg tgcaagctgc tgcgcggcac caaggccctg accgaggtga 1380 
tccccctgac cgaggaggcc gagctggagc tggccgagaa ccgcgagatc ctgaaggagc 1440
ccgtgcacga ggtgtactac gaccccagca aggacctggt ggccgagatc cagaagcagg 1500 
gccagggcca gtggacctac cagatctacc aggagccctt caagaacctg aagaccggca 1560 
agtacgcccg catgcgcggc gcccacacca acgacgtgaa gcagctgacc gaggccgtgc 1620 
agaaggtgag caccgagagc atcgtgatct ggggcaagat ccccaagttc aagctgccca 1680 
tccagaagga gacctgggag gcctggtgga tggagtactg gcaggccacc tggatccccg 1740
agtgggagtt cgtgaacacc ccccccctgg tgaagctgtg gtaccagctg gagaaggagc 1800 
ccatcgtggg cgccgagacc ttctacgtgg acggcgccgc caaccgcgag accaagctgg 1860 
<210> 85
<211> 306 
<212> DNA 

<213> Human immunodeficiency virus 
<400> 85 

atggagccag tagatcctag attagagccc tggaagcatc caggaagtca gcctaagact 60 
gcttgtacaa attgctattg taaaaagtgt tgctttcatt gccaagtttg tttcataaca 120 
aaaggcttag gcatctccta tggcaggaag aagcggagac agcgacgaag agctcctcca 180 
gacagtgagg ttcatcaagt ttctctacca aagcaacccg cttcccagcc ccaaggggac 240
ccgacaggcc cgaaggaatc gaagaagaag gtggagagag agacagagac agatccagtc 300 
cattag 306 

<210> 86 
<211> 101 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 86 

Met Glu Pro Val Asp Pro Arg Leu Glu Pro Trp Lys His Pro Gly Ser
1               5                   10                  15

Gln Pro Lys Thr Ala Cys Thr Asn Cys Tyr Cys Lys Lys Cys Cys Phe
            20                  25                  30

His Cys Gln Val Cys Phe Ile Thr Lys Gly Leu Gly Ile Ser Tyr Gly
        35                  40                  45

Arg Lys Lys Arg Arg Gln Arg Arg Arg Ala Pro Pro Asp Ser Glu Val
    50                  55                  60

His Gln Val Ser Leu Pro Lys Gln Pro Ala Ser Gln Pro Gln Gly Asp
65                  70                  75                  80

Pro Thr Gly Pro Lys Glu Ser Lys Lys Lys Val Glu Arg Glu Thr Glu
                85                  90                  95

Thr Asp Pro Val His
            100



<210> 87 
<211> 306 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: tat.SF162.opt
<400> 87 

atggagcccg tggacccccg cctggagccc tggaagcacc ccggcagcca gcccaagacc 60 
gcctgcacca actgctactg caagaagtgc tgcttccact gccaggtgtg cttcatcacc 120 
aagggcctgg gcatcagcta cggccgcaag aagcgccgcc agcgccgccg cgcccccccc 180 
gacagcgagg tgcaccaggt gagcctgccc aagcagcccg ccagccagcc ccagggcgac 240
cccaccggcc ccaaggagag caagaagaag gtggagcgcg agaccgagac cgaccccgtg 300 
cactag 306 

<210> 88 
<211> 306 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
tat.cys22.SF162.opt

<400> 88 

atggagcccg tggacccccg cctggagccc tggaagcacc ccggcagcca gcccaagacc 60 

gccggcacca actgctactg caagaagtgc tgcttccact gccaggtgtg cttcatcacc 120 

aagggcctgg gcatcagcta cggccgcaag aagcgccgcc agcgccgccg cgcccccccc 180 

gacagcgagg tgcaccaggt gagcctgccc aagcagcccg ccagccagcc ccagggcgac 240

cccaccggcc ccaaggagag caagaagaag gtggagcgcg agaccgagac cgaccccgtg 300 

cactag 306 

<210> 89 
<211> 168 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
tatamino.SF162.opt

<400> 89 

atggagcccg tggacccccg cctggagccc tggaagcacc ccggcagcca gcccaagacc 60 

gcctgcacca actgctactg caagaagtgc tgcttccact gccaggtgtg cttcatcacc 120 

aagggcctgg gcatcagcta cggccgcaag aagcgccgcc agcgccgc 168 

<210> 90 
<211> 102 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: tat cys22 
SF162 protein 

<400> 90 

Met Glu Pro Val Asp Pro Arg Leu Glu Pro Trp Lys His Pro Gly Ser
1               5                   10                  15

Gln Pro Lys Thr Ala Gly Thr Asn Cys Tyr Cys Lys Lys Cys Cys Phe
            20                  25                  30

His Cys Gln Val Cys Phe Ile Thr Lys Gly Leu Gly Ile Ser Tyr Gly
        35                  40                  45

Arg Lys Lys Arg Arg Gln Arg Arg Arg Ala Pro Pro Asp Ser Glu Val
    50                  55                  60

His Gln Val Ser Leu Pro Lys Gln Pro Ala Ser Gln Pro Gln Gly Asp
65                  70                  75                  80

Pro Thr Gly Pro Lys Glu Ser Lys Lys Lys Val Glu Arg Glu Thr Glu
                85                  90                  95

Thr Asp Pro Val His Glx
            100